Run multiple pipeline instances in parallel
Launch multiple instances of your pipeline, each with different parameter configurations. This pattern scales your ML workflows across multiple contexts: different factories, regions, customers, or any dimension that requires isolated pipeline runs.
When to use parallel pipelines
Parallel pipeline runs solve the multi-context problem: you need the same workflow executed independently for different scenarios.
Common scenarios:
Multi-site deployments: Train site-specific models (factories, stores, regions)
Customer-specific models: Fine-tune base models for different clients
A/B testing pipelines: Run competing pipeline configurations side-by-side
Multi-domain training: Apply the same pipeline to different data domains
💡 Parallel pipelines vs. task nodes: Use parallel pipelines when each context needs its own complete pipeline. Use task nodes when you want parallel execution within a single pipeline.
Prerequisites
Before creating parallel pipeline runs:
Define at least one pipeline parameter in your valohai.yaml
Push your pipeline configuration to a Git repository
Fetch the latest commit in your Valohai project
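If your pipeline doesn't define one yet, a pipeline parameter is declared under the pipeline's parameters block and pointed at one or more node parameters via targets. A minimal sketch, with illustrative step and parameter names, could look like this:
- pipeline:
    name: Example Pipeline
    parameters:
      - name: context_id            # the value you will vary per pipeline instance
        targets:
          - train.parameters.context_id
        default: "context_001"
    nodes:
      - name: train
        step: train_model
        type: execution
The targeted node's step must define a matching context_id parameter of its own.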
Example: Multi-factory pipeline
Here's a pipeline that processes factory-specific data. The factory_identifier pipeline parameter targets the factory_id parameter in both steps; varying it creates separate pipeline instances:
- step:
    name: preprocess_factory_data
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./preprocess.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
      - name: quality_threshold
        type: float
        default: 0.95
    inputs:
      - name: raw_data
        default: s3://data/factories/{parameter:factory_id}/raw/

- step:
    name: train_quality_model
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./train.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
      - name: model_type
        type: string
        default: "quality_inspector"
    inputs:
      - name: training_data
        optional: true

- pipeline:
    name: Factory Quality Pipeline
    parameters:
      - name: factory_identifier
        targets:
          - preprocess.parameters.factory_id
          - train.parameters.factory_id
        default: "factory_001"
    nodes:
      - name: preprocess
        step: preprocess_factory_data
        type: execution
      - name: train
        step: train_quality_model
        type: execution
        override:
          inputs:
            - name: training_data
    edges:
      - [preprocess.output.processed_data*, train.input.training_data]
Create parallel pipelines in the UI
Transform your single pipeline into multiple parallel runs with different configurations:
Step-by-step setup
Navigate to pipelines
Open your project
Click the Pipelines tab
Click Create Pipeline
Select your blueprint
Choose your commit
Select the pipeline with parameters
Configure for parallel execution
Scroll to Pipeline Parameters
Change the dropdown from "Single Pipeline" to "Pipeline Task"
Toggle Variant to "on"
Define parameter variations
Enter each parameter value on a new line
Each value creates a separate pipeline instance
Launch the pipelines
Click Create Pipeline
View your Pipeline Task containing all instances
Configuration example
For example, three factory IDs create three independent pipelines:
Pipeline 1: factory_berlin
Pipeline 2: factory_munich
Pipeline 3: factory_hamburg
Each pipeline runs the complete workflow with its specific factory context.
Parameter configuration patterns
Simple list (one parameter)
factory_001
factory_002
factory_003
Multiple parameters
When you have multiple pipeline parameters, Valohai creates pipelines for every combination:
Parameter 1: region → europe, asia
Parameter 2: model_size → small, large
Result: 4 pipelines (europe-small, europe-large, asia-small, asia-large)
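In valohai.yaml this could look like the sketch below (step and node names are illustrative); both parameters live at the pipeline level, and you list the variations for each one in the UI:
- pipeline:
    name: Regional Training Pipeline
    parameters:
      - name: region
        targets:
          - train.parameters.region
        default: "europe"
      - name: model_size
        targets:
          - train.parameters.model_size
        default: "small"
    nodes:
      - name: train
        step: train_model
        type: execution
Entering europe and asia for region and small and large for model_size then produces the four pipelines listed above.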
Dynamic parameter usage
Use parameters in your steps to create context-specific behavior:
- step:
    name: train_step
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./train.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
    inputs:
      - name: dataset
        default: dataset://{parameter:factory_id}/latest # dataset://factory_001/latest
        optional: true
Combine with task nodes
You can use both patterns together: parallel pipelines where each pipeline contains task nodes for hyperparameter tuning:
- pipeline:
    name: Factory ML Pipeline
    parameters:
      - name: factory_id
        targets:
          - train.parameters.factory_id
    nodes:
      - name: train
        type: task  # Each factory runs hyperparameter tuning
        step: train_model
This creates multiple factory-specific pipelines, each running hyperparameter optimization.
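The task node sweeps parameters defined on its underlying step, so that step needs to declare whatever you want to tune. A minimal sketch of a matching train_model step, with a hypothetical learning_rate parameter, might look like this:
- step:
    name: train_model
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./train.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
      - name: learning_rate        # hypothetical parameter for the task node to sweep
        type: float
        default: 0.001
Each parallel pipeline then runs its own sweep over learning_rate, while factory_id is supplied by the pipeline parameter.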