Configuration Files

Valohai generates two configuration files during Distributed Task executions: /valohai/config/distributed.json and /valohai/config/distributed.yaml.

Both files contain identical information in different formats. These files are only present in executions that are part of a Distributed Task.
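Because the files exist only inside Distributed Task executions, a script that also runs as a regular execution may want to check for the file before reading it. The following is a minimal sketch; the helper name `load_distributed_config` is hypothetical and is not part of valohai-utils.

```python
import json
from pathlib import Path

# Path taken from the description above.
DISTRIBUTED_CONFIG_PATH = Path("/valohai/config/distributed.json")


def load_distributed_config(path=DISTRIBUTED_CONFIG_PATH):
    """Return the parsed configuration, or None when the file is absent.

    Hypothetical helper for illustration; not part of valohai-utils.
    """
    path = Path(path)
    if not path.is_file():
        # No file: this execution is not part of a Distributed Task.
        return None
    with path.open() as f:
        return json.load(f)


config = load_distributed_config()
if config is None:
    print("Not part of a Distributed Task; running as a single worker.")
else:
    task = config["config"]
    print(f"Worker {task['member_id']} of {task['required_count']}")
```

Returning None instead of raising lets the same script fall back to single-worker behavior without a separate code path for error handling.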

Reading configuration files

The valohai-utils Python package includes helpers under valohai.distributed to read and use these configuration files.

You can also parse the files manually if you're not using Python or prefer direct access.

Configuration structure

Top-level fields

| Field | Type | Description |
| --- | --- | --- |
| config | Object | Task-level configuration (group name, member ID, worker count) |
| members | Array | List of all workers in the Distributed Task |
| self | Object | Details about the current worker (duplicate of this worker's entry in members) |

config object

| Field | Type | Description |
| --- | --- | --- |
| group_name | String | Distributed group identifier, typically derived from the Task ID |
| member_id | String | Unique identifier for the worker reading this configuration file |
| required_count | Integer | Total number of workers expected in this Distributed Task |
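One common use of member_id and required_count is to split work between workers. The sketch below assumes that member_id is a number stored as a string and that the IDs run from 0 to required_count - 1; the function name `shard_for_worker` and the file names are made up for illustration.

```python
def shard_for_worker(items, config):
    """Return the round-robin share of items for the current worker.

    Assumes config["config"]["member_id"] is a numeric string in the
    range 0 .. required_count - 1.
    """
    rank = int(config["config"]["member_id"])
    size = config["config"]["required_count"]
    # Worker 0 takes items 0, size, 2*size, ...; worker 1 takes 1, size+1, ...
    return items[rank::size]


# Trimmed sample configuration containing only the config object.
sample = {
    "config": {
        "group_name": "task-example",
        "member_id": "1",
        "required_count": 3,
    }
}

files = [f"part-{i}.csv" for i in range(7)]
print(shard_for_worker(files, sample))
```

With the sample above, the worker with member_id "1" receives part-1.csv and part-4.csv.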

members array

Each object in members represents one worker in the Distributed Task.

| Field | Type | Description |
| --- | --- | --- |
| announce_time | String | ISO 8601 timestamp when the worker joined the group |
| identity | String | Machine identifier (depends on infrastructure) |
| job_id | String | Execution identifier for this worker |
| member_id | String | Unique member identifier (typically a number as a string: "0", "1", "2") |
| network | Object | Network connection details (see below) |
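The order of entries in members should not be relied on; the example file on this page lists member IDs as 0, 2, 1. If you need a stable ordering, for instance to build a host list, you can sort by member_id. This is a sketch that assumes every member_id is a numeric string; `members_by_rank` is a hypothetical helper name.

```python
def members_by_rank(config):
    """Return members sorted by numeric member_id.

    Assumes every member_id is a number stored as a string.
    """
    return sorted(config["members"], key=lambda m: int(m["member_id"]))


# Trimmed sample in the same order as the example file on this page.
sample = {
    "members": [
        {"member_id": "0", "network": {"local_ips": ["10.0.16.61"]}},
        {"member_id": "2", "network": {"local_ips": ["10.0.16.60"]}},
        {"member_id": "1", "network": {"local_ips": ["10.0.16.59"]}},
    ]
}

# First local IP of each worker, ordered by member_id.
hosts = [m["network"]["local_ips"][0] for m in members_by_rank(sample)]
print(hosts)
```

Sorting with int() rather than comparing the raw strings keeps "10" after "9" once a task has more than ten workers.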

network object

| Field | Type | Description |
| --- | --- | --- |
| exposed_ports | Object | Map of host port to container port (e.g., {"1234": "1234"}). Empty if all ports are exposed via VH_DOCKER_NETWORK=host. |
| local_ips | Array | List of local IP addresses to reach this worker |
| public_ips | Array | List of public IP addresses to reach this worker (if available) |
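To connect to another worker you need an IP address and the host port that maps to the port your process listens on inside the container. The sketch below shows one way to combine the network fields; `member_address` is a hypothetical helper, and treating an empty exposed_ports map as "use the same port number" is an assumption based on the host-networking note above.

```python
def member_address(member, container_port, prefer_public=False):
    """Build an "ip:port" string for reaching container_port on a member.

    Hypothetical helper for illustration; not part of valohai-utils.
    """
    network = member["network"]
    ips = network["public_ips"] if prefer_public else network["local_ips"]
    if not ips:
        raise ValueError("member has no IP address of the requested kind")

    exposed = network["exposed_ports"]
    if not exposed:
        # Assumption: an empty map means host networking, so the
        # container port is reachable under the same number.
        host_port = str(container_port)
    else:
        # Keys are host ports, values are container ports.
        matches = [h for h, c in exposed.items() if str(c) == str(container_port)]
        if not matches:
            raise ValueError(f"container port {container_port} is not exposed")
        host_port = matches[0]
    return f"{ips[0]}:{host_port}"


# Sample member with a host port that differs from the container port.
worker = {
    "network": {
        "exposed_ports": {"8080": "1234"},
        "local_ips": ["10.0.16.61"],
        "public_ips": ["34.121.32.110"],
    }
}
print(member_address(worker, 1234))
print(member_address(worker, 1234, prefer_public=True))
```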

self object

The self object is a convenience field that duplicates the current worker's entry from the members array. It has the same structure as a members object.
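Since self carries the current worker's member_id, it can be used to separate the current worker from its peers. This is a minimal sketch with a trimmed sample configuration; `peers` is a hypothetical helper name.

```python
def peers(config):
    """Return every member except the current worker."""
    my_id = config["self"]["member_id"]
    return [m for m in config["members"] if m["member_id"] != my_id]


# Trimmed sample: only the fields this sketch needs.
sample = {
    "members": [
        {"member_id": "0", "identity": "happy-yjaqaqlx"},
        {"member_id": "2", "identity": "happy-kwfncqxe"},
        {"member_id": "1", "identity": "happy-tcuaezxm"},
    ],
    "self": {"member_id": "0", "identity": "happy-yjaqaqlx"},
}

for peer in peers(sample):
    print(peer["member_id"], peer["identity"])
```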

Example configuration file

{
  "config": {
    "group_name": "task-0180f5a9-9ffe-4e09-d5a7-9a0a507019d4",
    "member_id": "0",
    "required_count": 3
  },
  "members": [
    {
      "announce_time": "2022-05-24T10:42:57",
      "identity": "happy-yjaqaqlx",
      "job_id": "exec-0180f5a9-a002-45a0-f0e6-8e98720eeaad",
      "member_id": "0",
      "network": {
        "exposed_ports": {
          "1234": "1234"
        },
        "local_ips": [
          "10.0.16.61"
        ],
        "public_ips": [
          "34.121.32.110"
        ]
      }
    },
    {
      "announce_time": "2022-05-24T10:42:58",
      "identity": "happy-kwfncqxe",
      "job_id": "exec-0180f5a9-a007-633b-8af3-e11593482653",
      "member_id": "2",
      "network": {
        "exposed_ports": {
          "1234": "1234"
        },
        "local_ips": [
          "10.0.16.60"
        ],
        "public_ips": [
          "34.134.18.149"
        ]
      }
    },
    {
      "announce_time": "2022-05-24T10:42:57",
      "identity": "happy-tcuaezxm",
      "job_id": "exec-0180f5a9-a005-f2ef-693a-3b4c4c115ed8",
      "member_id": "1",
      "network": {
        "exposed_ports": {
          "1234": "1234"
        },
        "local_ips": [
          "10.0.16.59"
        ],
        "public_ips": [
          "35.194.55.255"
        ]
      }
    }
  ],
  "self": {
    "announce_time": "2022-05-24T10:42:57",
    "identity": "happy-yjaqaqlx",
    "job_id": "exec-0180f5a9-a002-45a0-f0e6-8e98720eeaad",
    "member_id": "0",
    "network": {
      "exposed_ports": {
        "1234": "1234"
      },
      "local_ips": [
        "10.0.16.61"
      ],
      "public_ips": [
        "34.121.32.110"
      ]
    }
  }
}

Using configuration in Python

The valohai-utils package provides helpers to read this configuration:

import valohai

# Get all workers
members = valohai.distributed.members()
for member in members:
    print(member.primary_local_ip)

# Get the master worker (first to announce)
master = valohai.distributed.master()
master_url = f"tcp://{master.primary_local_ip}:1234"

# Get current worker details
me = valohai.distributed.me()
rank = valohai.distributed.rank
size = valohai.distributed.required_count

Using configuration manually

Read the JSON or YAML file directly:

import json

with open('/valohai/config/distributed.json') as f:
    config = json.load(f)

# Get current worker's member_id
my_member_id = config['config']['member_id']

# Get all worker IPs
worker_ips = [m['network']['local_ips'][0] for m in config['members']]

# Get the master worker (assumes the master is the worker with member_id "0")
master = next(m for m in config['members'] if m['member_id'] == "0")
master_ip = master['network']['local_ips'][0]
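The comment on valohai.distributed.master() above describes the master as the first worker to announce. If you want to apply that definition when parsing the file yourself, you can compare announce_time values instead of matching on member_id. The sketch below is one possible approach, not the library's implementation; breaking ties by the lowest numeric member_id is an assumption, and `first_to_announce` is a hypothetical helper name.

```python
from datetime import datetime


def first_to_announce(members):
    """Return the member with the earliest announce_time.

    Assumption: ties are broken by the lowest numeric member_id.
    """
    return min(
        members,
        key=lambda m: (
            datetime.fromisoformat(m["announce_time"]),
            int(m["member_id"]),
        ),
    )


# announce_time and member_id values copied from the example file.
sample_members = [
    {"member_id": "0", "announce_time": "2022-05-24T10:42:57"},
    {"member_id": "2", "announce_time": "2022-05-24T10:42:58"},
    {"member_id": "1", "announce_time": "2022-05-24T10:42:57"},
]

print(first_to_announce(sample_members)["member_id"])
```

In the example file two workers share the earliest timestamp, which is why a tie-break rule is needed at all.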
