Pipeline Parameters

Pipeline parameters let you control multiple nodes with a single value. Instead of updating each node individually, define shared parameters that automatically propagate to their targets.

When to use pipeline parameters

Use pipeline parameters when:

  • Multiple nodes need the same configuration value (e.g., batch size, data version)

  • You want to experiment with different settings across an entire pipeline

  • You need centralized control over distributed processing parameters

Configure in valohai.yaml

Define pipeline parameters with their target nodes in your valohai.yaml:

- step:
    name: preprocess-dataset
    image: python:3.9
    command:
      - pip install numpy valohai-utils
      - python ./preprocess_dataset.py
    parameters:
      - name: exec_id
        type: string
      - name: filters
        type: string
        default: ["low-pass"]
    inputs:
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/mnist.npz

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install valohai-utils
      - python ./train_model.py {parameters}
    parameters:
      - name: exec_id
        type: string
      - name: train_param
        type: integer
        default: 5

- pipeline:
    name: shared-parameters-example
    parameters:
      - name: id
        targets:
          - preprocess.parameters.exec_id
          - train.parameters.exec_id
      - name: training_parameter
        targets:
          - train.parameters.train_param
        default: 3
      - name: filters
        target: preprocess.parameters.filters
        default: ["remove-outliers", "normalize"]
    nodes:
      - name: preprocess
        step: preprocess-dataset
        type: execution
      - name: train
        step: train-model
        type: execution
      - name: train_in_task
        step: train-model
        type: task
    edges:
      - [preprocess.output.preprocessed_mnist.npz, train.input.dataset]
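
For context, here is a minimal sketch of what preprocess_dataset.py might look like with valohai-utils (the filtering step itself is illustrative):

import numpy as np
import valohai

# Parameter values resolve to the pipeline-level overrides
# when this step runs as the "preprocess" node.
exec_id = valohai.parameters("exec_id").value
filters = valohai.parameters("filters").value

# Valohai downloads the input and exposes a local path.
data = np.load(valohai.inputs("dataset").path())
x_train, y_train = data["x_train"], data["y_train"]

# ... apply the configured filters here (illustrative) ...

# Save under the name the pipeline edge expects, so the file
# can flow to train.input.dataset.
np.savez(valohai.outputs().path("preprocessed_mnist.npz"),
         x_train=x_train, y_train=y_train)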

Key concepts

Target syntax: <node-name>.parameters.<step-parameter-name>

Multiple targets: One pipeline parameter can override multiple node parameters:

- name: id
  targets:
    - preprocess.parameters.exec_id
    - train.parameters.exec_id

Selective targeting: nodes that are not listed as targets keep their own values. In the example, train_in_task is not a target of the id pipeline parameter, so its exec_id is left to whatever is set at the node (or step) level.

Access in your code

Pipeline parameters work exactly like regular parameters in your code:

Command-line parsing

import argparse

# Parameters arrive as command-line flags via the {parameters}
# placeholder in the step command.
parser = argparse.ArgumentParser()
parser.add_argument('--exec_id', type=str)
parser.add_argument('--train_param', type=int, default=5)
args = parser.parse_args()
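
For example, with illustrative values, the train node's command expands to:

python ./train_model.py --exec_id=run-42 --train_param=5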

Python with valohai-utils

import valohai

# Access the parameter value; it works the same whether it was
# set at the pipeline or node level
exec_id = valohai.parameters("exec_id").value
filters = valohai.parameters("filters").value

💡 Your code doesn't need to know whether a parameter comes from the pipeline or node level.

Multi-value parameters for tasks

Pipeline parameters can distribute multiple values across task nodes:

- pipeline:
    name: parallel-processing
    parameters:
      - name: task_configs
        target: process_batch.parameters.config_id
        default: [100, 200, 300, 400]
    nodes:
      - name: process_batch
        step: batch-processor
        type: task  # Creates 4 parallel executions

Each value creates a separate execution:

  • Execution 1: config_id=100

  • Execution 2: config_id=200

  • Execution 3: config_id=300

  • Execution 4: config_id=400
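
Inside each of these executions, the code reads a single value. A minimal sketch of a hypothetical batch_processor.py:

import valohai

# Each parallel execution in the task receives exactly one
# value from the task_configs list.
config_id = valohai.parameters("config_id").value
print(f"Processing batch with config_id={config_id}")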

Web interface behavior

In the web interface:

  1. Shared Parameters section shows all pipeline parameters and their targets

  2. Overridden parameters appear grayed out in node configurations

  3. Non-targeted parameters remain editable at the node level

  4. Task nodes allow multiple value entry for parallel execution

Common patterns

Experiment tracking

Share a unique ID across all nodes for unified logging:

- name: experiment_id
  targets:
    - preprocess.parameters.exp_id
    - train.parameters.exp_id
    - evaluate.parameters.exp_id
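
Each node can then attach the shared ID to its metadata. A sketch using valohai-utils logging (the accuracy metric is illustrative):

import valohai

exp_id = valohai.parameters("exp_id").value

# valohai-utils prints metadata as JSON, which Valohai collects
# per execution; logging the shared ID ties the nodes together.
with valohai.logger() as logger:
    logger.log("exp_id", exp_id)
    logger.log("accuracy", 0.97)  # illustrative value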

Resource scaling

Adjust resource-sensitive settings, such as batch size, uniformly across nodes:

- name: batch_size
  targets:
    - preprocess.parameters.batch_size
    - train.parameters.batch_size
  default: 32
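
Every targeted node then reads the same value, for instance (a self-contained sketch with a stand-in dataset):

import argparse

import numpy as np

parser = argparse.ArgumentParser()
parser.add_argument('--batch_size', type=int, default=32)
args = parser.parse_args()

samples = np.arange(100)  # stand-in for the real dataset

# Process the data in uniform batches of the shared size.
for start in range(0, len(samples), args.batch_size):
    batch = samples[start:start + args.batch_size]
    print(f"processing {len(batch)} samples")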
