Run parallel executions within a pipeline
Use task nodes to run multiple executions in parallel within your pipeline. They're ideal for hyperparameter tuning, parameter sweeps, evaluating multiple datasets or models side by side, or any other scenario where you need to explore multiple configurations as part of your ML workflow.
When to use task nodes
Task nodes solve a common ML workflow pattern: running the same step multiple times with different parameters. Instead of creating separate pipelines or manually orchestrating parallel runs, a single task node handles it all.
Common scenarios:
- Hyperparameter optimization: Test multiple learning rates, batch sizes, or model architectures
- Cross-validation: Run k-fold validation as part of your pipeline
- Multi-configuration training: Train models with different preprocessing options
💡 Task node vs. parallel pipelines: Use task nodes when you want parallel execution within a pipeline. Use parallel pipeline runs when you need to run entire pipelines with different configurations.
How task nodes work
A task node spawns multiple executions of the same step, each potentially with different parameter values. All outputs from these executions flow to the next pipeline node.
```yaml
- pipeline:
    name: Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: Preprocess dataset (MNIST)
      - name: train
        type: task  # This node runs multiple executions
        step: Train model (MNIST)
        override:
          inputs:
            - name: training-set-images
            - name: training-set-labels
            - name: test-set-images
            - name: test-set-labels
      - name: evaluate
        type: execution
        step: Batch inference (MNIST)
    edges:
      - [preprocess.output.*train-images*, train.input.training-set-images]
      - [preprocess.output.*train-labels*, train.input.training-set-labels]
      - [preprocess.output.*test-images*, train.input.test-set-images]
      - [preprocess.output.*test-labels*, train.input.test-set-labels]
      - [train.output.model*, evaluate.input.model]
```
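A task node varies the parameters declared on the step it runs; the actual grid of values is configured when you create the pipeline (see Create task nodes in the UI below). As a rough sketch, the Train model (MNIST) step referenced above might declare its sweepable parameters in valohai.yaml like this (the image, command, and parameter names here are illustrative, not part of the example project):

```yaml
- step:
    name: Train model (MNIST)
    image: python:3.9          # illustrative Docker image
    command: python train.py {parameters}
    parameters:
      - name: learning_rate    # illustrative parameter swept by the task node
        type: float
        default: 0.001
      - name: batch_size       # illustrative parameter swept by the task node
        type: integer
        default: 32
    inputs:
      - name: training-set-images
      - name: training-set-labels
      - name: test-set-images
      - name: test-set-labels
```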
Handle task failures
Task nodes need clear rules for handling failures since multiple executions run in parallel. Configure the `on-error` property to control pipeline behavior when executions fail.
Error handling options
| Option | Behavior | Use when |
| --- | --- | --- |
| `stop-all` | Stop the entire pipeline if any execution fails (default) | Every execution must succeed |
| `continue` | Continue despite failures if at least one execution succeeds | You expect some parameter combinations to fail |
| `stop-next` | Stop downstream nodes but let parallel branches continue | You have independent pipeline branches |
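The `on-error` property sits directly on the task node in your pipeline definition. A minimal sketch (node and step names are illustrative; the full example below shows two strategies side by side):

```yaml
- name: train
  type: task
  on-error: continue  # keep the pipeline going if some parameter combinations fail
  step: train-model
```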
Example: Robust hyperparameter search
This pipeline runs two parallel hyperparameter searches with different failure strategies:
```yaml
- pipeline:
    name: Dual Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
      - name: train_conservative
        type: task
        on-error: stop-next  # Fail fast for critical path
        step: train-model
        override:
          inputs:
            - name: dataset
      - name: evaluate_conservative
        type: execution
        step: batch-inference
      - name: train_experimental
        type: task
        on-error: continue  # Allow experimental configs to fail
        step: train-model
        override:
          inputs:
            - name: dataset
      - name: evaluate_experimental
        type: execution
        step: batch-inference
    edges:
      - [preprocess.output.preprocessed_mnist.npz, train_conservative.input.dataset]
      - [preprocess.output.preprocessed_mnist.npz, train_experimental.input.dataset]
      - [train_conservative.output.model*, evaluate_conservative.input.model]
      - [train_experimental.output.model*, evaluate_experimental.input.model]
```

Result: Conservative training stops the evaluation if any execution fails. Experimental training continues to evaluation as long as one configuration succeeds.
Handle task outputs
Task nodes pass all outputs to downstream nodes, but there's a critical limitation to understand.
Warning: If multiple executions produce outputs with identical filenames, only one file (chosen randomly) passes to the next node.
Best practices for output naming
- Avoid: all executions saving `model.pkl`
- Better: include parameters in the filename, e.g. `model_lr0.01_batch32.pkl`
Python example with valohai-utils:
```python
import valohai

# Get parameter values
lr = valohai.parameters('learning_rate').value
batch_size = valohai.parameters('batch_size').value

# Create a unique output filename
model_filename = f"model_lr{lr}_batch{batch_size}.pkl"
valohai.outputs().live(model_filename)
```
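Downstream, the wildcard edge `train.output.model*` delivers every uniquely named model file to the evaluate node's `model` input. A minimal sketch of reading them all with valohai-utils (assuming the evaluation step declares an input named `model`):

```python
import valohai

# Each parallel execution contributed its own uniquely named model file,
# so the input may contain several files; iterate over all of them.
for model_path in valohai.inputs('model').paths():
    print(f"Evaluating {model_path}")
    # ... load the model and run evaluation here
```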
Create task nodes in the UI
Convert any execution node with parameters into a task node directly in the pipeline builder:
1. Open your project's Pipelines tab
2. Click Create Pipeline
3. Select your pipeline blueprint
4. Click on any node that has parameters
5. Click Convert to task below the graph
6. Configure your parameter grid in the Parameters section
7. Click Create pipeline