Run parallel executions within a pipeline

Use task nodes to run multiple executions in parallel within your pipeline. They're ideal for hyperparameter tuning, parameter sweeps, evaluating multiple datasets or models in parallel, or any scenario where you need to explore multiple configurations as part of your ML workflow.

When to use task nodes

Task nodes solve a common ML workflow pattern: running the same step multiple times with different parameters. Instead of creating separate pipelines or manually orchestrating parallel runs, a single task node handles it all.

Common scenarios:

  • Hyperparameter optimization: Test multiple learning rates, batch sizes, or model architectures

  • Cross-validation: Run k-fold validation as part of your pipeline

  • Multi-configuration training: Train models with different preprocessing options

💡 Task node vs. parallel pipelines: Use task nodes when you want parallel execution within a pipeline. Use parallel pipeline runs when you need to run entire pipelines with different configurations.

How task nodes work

A task node spawns multiple executions of the same step, each potentially with different parameter values. All outputs from these executions flow to the next pipeline node.

- pipeline:
    name: Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: Preprocess dataset (MNIST)
      - name: train
        type: task  # This node runs multiple executions
        step: Train model (MNIST)
        override:
          inputs:
            - name: training-set-images
            - name: training-set-labels
            - name: test-set-images
            - name: test-set-labels
      - name: evaluate
        type: execution
        step: Batch inference (MNIST)
    edges:
    - [preprocess.output.*train-images*, train.input.training-set-images]
    - [preprocess.output.*train-labels*, train.input.training-set-labels]
    - [preprocess.output.*test-images*, train.input.test-set-images]
    - [preprocess.output.*test-labels*, train.input.test-set-labels]
    - [train.output.model*, evaluate.input.model]
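
The task node references the same step definition as any execution node; the parameter values to sweep are configured when you create the pipeline. As a minimal sketch, the Train model (MNIST) step above might be declared in valohai.yaml roughly like this (the image, command, parameter names, and defaults are illustrative assumptions, not part of the example above):

- step:
    name: Train model (MNIST)
    image: python:3.9
    command: python train.py {parameters}
    parameters:
      - name: learning_rate  # swept by the task; the default is used by plain executions
        type: float
        default: 0.001
      - name: batch_size
        type: integer
        default: 32
    inputs:
      - name: training-set-images
      - name: training-set-labels
      - name: test-set-images
      - name: test-set-labels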

Handle task failures

Task nodes need clear rules for handling failures since multiple executions run in parallel. Configure the on-error property to control pipeline behavior when executions fail.

Error handling options

  • stop-all (default): Stop the entire pipeline if any execution fails. Use when every execution must succeed.

  • continue: Continue despite failures as long as at least one execution succeeds. Use when you expect some parameter combinations to fail.

  • stop-next: Stop downstream nodes but let parallel branches continue. Use when you have independent pipeline branches.

This pipeline runs two parallel hyperparameter searches with different failure strategies:

- pipeline:
    name: Dual Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
      - name: train_conservative
        type: task
        on-error: stop-next  # Fail fast for critical path
        step: train-model
        override:
          inputs:
            - name: dataset
      - name: evaluate_conservative
        type: execution
        step: batch-inference
      - name: train_experimental
        type: task
        on-error: continue  # Allow experimental configs to fail
        step: train-model
        override:
          inputs:
            - name: dataset
      - name: evaluate_experimental
        type: execution
        step: batch-inference
    edges:
      - [preprocess.output.preprocessed_mnist.npz, train_conservative.input.dataset]
      - [preprocess.output.preprocessed_mnist.npz, train_experimental.input.dataset]
      - [train_conservative.output.model*, evaluate_conservative.input.model]
      - [train_experimental.output.model*, evaluate_experimental.input.model]

Result: If any conservative training execution fails, the conservative evaluation is skipped while the experimental branch keeps running. Experimental training continues to evaluation as long as at least one configuration succeeds.

Handle task outputs

Task nodes pass all outputs to downstream nodes, but there's a critical limitation to understand.

Warning: If multiple executions produce outputs with identical filenames, only one file (chosen randomly) passes to the next node.

Best practices for output naming

  • Avoid: all executions saving model.pkl

  • Better: include parameters in the filename, e.g. model_lr0.01_batch32.pkl

Python example with valohai-utils:

import valohai

# Get parameter values
lr = valohai.parameters('learning_rate').value
batch_size = valohai.parameters('batch_size').value

# Create unique output filename
model_filename = f"model_lr{lr}_batch{batch_size}.pkl"
valohai.outputs().live(model_filename)
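
Because each parallel execution receives different parameter values, every filename is unique, so the train.output.model* edge in the earlier examples passes each model on to the evaluation node instead of a single arbitrary file.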

Create task nodes in the UI

Convert any execution node with parameters into a task node directly in the pipeline builder:

  1. Open your project's Pipelines tab

  2. Click Create Pipeline

  3. Select your pipeline blueprint

  4. Click on any node that has parameters

  5. Click Convert to task below the graph

  6. Configure your parameter grid in the Parameters section

  7. Click Create pipeline
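
Note: converting in the UI applies to the pipeline you are creating. To make the node a task on every run, set type: task in your pipeline YAML, as in the examples above.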
