Run multiple pipeline instances in parallel

Launch multiple instances of your pipeline, each with a different parameter configuration. This pattern scales your ML workflows across multiple contexts: different factories, regions, customers, or any other dimension that requires isolated pipeline runs.

When to use parallel pipelines

Parallel pipeline runs solve the multi-context problem: the same workflow needs to run independently for each of several scenarios.

Common scenarios:

  • Multi-site deployments: Train site-specific models (factories, stores, regions)

  • Customer-specific models: Fine-tune base models for different clients

  • A/B testing pipelines: Run competing pipeline configurations side-by-side

  • Multi-domain training: Apply the same pipeline to different data domains

💡 Parallel pipelines vs. task nodes: Use parallel pipelines when each context needs its own complete pipeline. Use task nodes when you want parallel execution within a single pipeline.

Prerequisites

Before creating parallel pipeline runs:

  1. Define at least one pipeline parameter in your valohai.yaml (see the minimal sketch after this list)

  2. Push your pipeline configuration to a Git repository

  3. Fetch the latest commit in your Valohai project
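
The pipeline parameter is what gets varied between instances, so requirement 1 is the essential one. A minimal sketch of that piece (step, node, and parameter names here are placeholders):

- pipeline:
    name: My Parallel Pipeline
    parameters:
      - name: context_id
        targets:
          - process.parameters.context_id
        default: "context_a"
    nodes:
      - name: process
        step: process_data
        type: execution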

Example: Multi-factory pipeline

Here's a pipeline that processes factory-specific data. The pipeline-level factory_identifier parameter feeds the factory_id parameter of both steps, so varying its value creates separate pipeline instances:

- step:
    name: preprocess_factory_data
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./preprocess.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
      - name: quality_threshold
        type: float
        default: 0.95
    inputs:
      - name: raw_data
        default: s3://data/factories/{parameter:factory_id}/raw/
        
- step:
    name: train_quality_model
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./train.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
      - name: model_type
        type: string
        default: "quality_inspector"
    inputs:
      - name: training_data
        optional: true
 
- pipeline:
    name: Factory Quality Pipeline
    parameters:
      - name: factory_identifier
        targets:
          - preprocess.parameters.factory_id
          - train.parameters.factory_id
        default: "factory_001"
    nodes:
      - name: preprocess
        step: preprocess_factory_data
        type: execution
      - name: train
        step: train_quality_model
        type: execution
        override:
          inputs:
            - name: training_data
    edges:
      - [preprocess.output.processed_data*, train.input.training_data]
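
Note how the pieces connect: factory_identifier pushes a single value into the factory_id parameter of both nodes, and the {parameter:factory_id} interpolation in the raw_data default resolves at run time, so each parallel instance reads from its own s3://data/factories/.../raw/ prefix.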

Create parallel pipelines in the UI

Transform your single pipeline into multiple parallel runs with different configurations:

Step-by-step setup

  1. Navigate to pipelines

    • Open your project

    • Click the Pipelines tab

    • Click Create Pipeline

  2. Select your blueprint

    • Choose your commit

    • Select the pipeline with parameters

  3. Configure for parallel execution

    • Scroll to Pipeline Parameters

    • Change dropdown from "Single Pipeline" to "Pipeline Task"

    • Toggle Variant on for the parameter(s) you want to vary

  4. Define parameter variations

    • Enter each parameter value on a new line

    • Each value creates a separate pipeline instance

  5. Launch the pipelines

    • Click Create Pipeline

    • View your Pipeline Task containing all instances

Configuration example

For example, three factory IDs create three independent pipelines:

  • Pipeline 1: factory_berlin

  • Pipeline 2: factory_munich

  • Pipeline 3: factory_hamburg

Each pipeline runs the complete workflow with its specific factory context.

Parameter configuration patterns

Simple list (one parameter)

factory_001
factory_002
factory_003

Multiple parameters

When you provide variant values for multiple pipeline parameters, Valohai creates a pipeline for every combination:

  • Parameter 1: region → europe, asia

  • Parameter 2: model_size → small, large

Result: 4 pipelines (europe-small, europe-large, asia-small, asia-large)
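
In valohai.yaml, each varied parameter is declared as its own pipeline parameter with its own targets; the combinations are generated at launch when you enter multiple variant values for both. A sketch of that configuration (step and node names are hypothetical):

- pipeline:
    name: Regional Model Pipeline
    parameters:
      - name: region
        targets:
          - train.parameters.region
        default: "europe"
      - name: model_size
        targets:
          - train.parameters.model_size
        default: "small"
    nodes:
      - name: train
        step: train_model
        type: execution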

Dynamic parameter usage

Use parameters in your steps to create context-specific behavior:

- step:
    name: train_step
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./train.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
    inputs:
      - name: dataset
        default: dataset://{parameter:factory_id}/latest # dataset://factory_001/latest
        optional: true
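
Parameters can also reach your script through the command line. Valohai expands the {parameters} placeholder into command-line flags (by default in the form --name=value), so a standard argument parser picks them up. The step below is a hypothetical sketch:

- step:
    name: report_factory_metrics
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./report.py {parameters}  # e.g. --factory_id=factory_001
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"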

Combine with task nodes

You can use both patterns together: parallel pipelines where each pipeline contains task nodes for hyperparameter tuning:

- pipeline:
    name: Factory ML Pipeline
    parameters:
      - name: factory_id
        targets:
          - train.parameters.factory_id
    nodes:
      - name: train
        type: task  # Each factory runs hyperparameter tuning
        step: train_model

This creates multiple factory-specific pipelines, each running hyperparameter optimization.
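
For completeness, the train_model step referenced above would declare the factory parameter plus whatever hyperparameters the task node sweeps; learning_rate below is an assumed example:

- step:
    name: train_model
    image: python:3.10
    command:
      - pip install valohai-utils
      - python ./train.py
    parameters:
      - name: factory_id
        type: string
        default: "factory_001"
      - name: learning_rate  # varied by the task node at launch, e.g. 0.001, 0.01, 0.1
        type: float
        default: 0.001

With three factory IDs and three learning-rate values, you would get three pipelines, each with a task node that runs three training executions.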
