# Run parallel executions within a pipeline

Use task nodes to run multiple executions in parallel within your pipeline. Task nodes suit hyperparameter tuning, parameter sweeps, evaluating multiple datasets or models in parallel, and any other scenario where you need to explore multiple configurations as part of your ML workflow.

### When to use task nodes

Task nodes solve a common ML workflow pattern: running the same step multiple times with different parameters. Instead of creating separate pipelines or manually orchestrating parallel runs, a single task node handles it all.

**Common scenarios:**

* **Hyperparameter optimization**: Test multiple learning rates, batch sizes, or model architectures
* **Cross-validation**: Run k-fold validation as part of your pipeline
* **Multi-configuration training**: Train models with different preprocessing options

> 💡 **Task node vs. parallel pipelines**: Use task nodes when you want parallel execution *within* a pipeline. Use [parallel pipeline runs](/pipelines/parallel-runs-in-a-pipeline/parallel-pipeline-runs.md) when you need to run entire pipelines with different configurations.

### How task nodes work

A task node spawns multiple executions of the same step, each potentially with different parameter values. All outputs from these executions flow to the next pipeline node.

```yaml
- pipeline:
    name: Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: Preprocess dataset (MNIST)
      - name: train
        type: task  # This node runs multiple executions
        step: Train model (MNIST)
        override:
          inputs:
            - name: training-set-images
            - name: training-set-labels
            - name: test-set-images
            - name: test-set-labels
      - name: evaluate
        type: execution
        step: Batch inference (MNIST)
    edges:
      - [preprocess.output.*train-images*, train.input.training-set-images]
      - [preprocess.output.*train-labels*, train.input.training-set-labels]
      - [preprocess.output.*test-images*, train.input.test-set-images]
      - [preprocess.output.*test-labels*, train.input.test-set-labels]
      - [train.output.model*, evaluate.input.model]
```
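Conceptually, a task node turns a set of parameter values into one execution per combination. The sketch below illustrates that expansion with a plain Cartesian product. It is not Valohai's implementation, and the parameter names `learning_rate` and `batch_size` are assumed for illustration only.

```python
from itertools import product


def expand_grid(grid):
    """Return one parameter dict per combination of the given value lists."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Hypothetical parameter grid for the `train` task node
grid = {"learning_rate": [0.001, 0.01], "batch_size": [32, 64]}
executions = expand_grid(grid)

# Each dict stands for one execution the task node would launch
for params in executions:
    print(params)
```

Two values for each of two parameters yield four combinations, so a grid like this would correspond to four parallel executions of the same step.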

### Handle task failures

Task nodes need clear rules for handling failures since multiple executions run in parallel. Configure the `on-error` property to control pipeline behavior when executions fail.

#### Error handling options

| Option      | Behavior                                                     | Use when                                       |
| ----------- | ------------------------------------------------------------ | ---------------------------------------------- |
| `stop-all`  | Stop the entire pipeline if any execution fails (default)    | Every execution must succeed                   |
| `continue`  | Continue despite failures if at least one execution succeeds | You expect some parameter combinations to fail |
| `stop-next` | Stop downstream nodes but let parallel branches continue     | You have independent pipeline branches         |
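To make the table concrete, the following sketch models the three options as a small decision function. It only mirrors the behavior described above and is not how Valohai evaluates `on-error` internally. The function name, the status strings, and the returned labels are hypothetical.

```python
def resolve_on_error(on_error, statuses):
    """Return what happens after a task node finishes, per the table above.

    `statuses` holds one "complete" or "error" string per execution
    (these status names are assumed for illustration).
    """
    failed = statuses.count("error")
    succeeded = statuses.count("complete")

    if failed == 0:
        return "proceed"
    if on_error == "stop-all":
        return "stop-pipeline"
    if on_error == "stop-next":
        return "stop-downstream"
    if on_error == "continue":
        # Assumption: with zero successes there is nothing to pass on,
        # so downstream nodes are treated as stopped.
        return "proceed" if succeeded > 0 else "stop-downstream"
    raise ValueError(f"unknown on-error value: {on_error!r}")


# One failed execution out of three, evaluated under each option
runs = ["complete", "error", "complete"]
for policy in ("stop-all", "continue", "stop-next"):
    print(policy, "->", resolve_on_error(policy, runs))
```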

#### Example: Robust hyperparameter search

This pipeline runs two parallel hyperparameter searches with different failure strategies:

```yaml
- pipeline:
    name: Dual Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
      - name: train_conservative
        type: task
        on-error: stop-next  # Fail fast for critical path
        step: train-model
        override:
          inputs:
            - name: dataset
      - name: evaluate_conservative
        type: execution
        step: batch-inference
      - name: train_experimental
        type: task
        on-error: continue  # Allow experimental configs to fail
        step: train-model
        override:
          inputs:
            - name: dataset
      - name: evaluate_experimental
        type: execution
        step: batch-inference
    edges:
      - [preprocess.output.preprocessed_mnist.npz, train_conservative.input.dataset]
      - [preprocess.output.preprocessed_mnist.npz, train_experimental.input.dataset]
      - [train_conservative.output.model*, evaluate_conservative.input.model]
      - [train_experimental.output.model*, evaluate_experimental.input.model]
```

**Result**: If any `train_conservative` execution fails, `evaluate_conservative` does not run, but the experimental branch is unaffected. `train_experimental` continues to `evaluate_experimental` as long as at least one configuration succeeds.

### Handle task outputs

Task nodes pass all outputs to downstream nodes, but there's a critical limitation to understand.

**Warning:** If multiple executions produce outputs with identical filenames, only one file (chosen randomly) passes to the next node.
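One way to spot this problem before it affects a downstream node is to check which filenames more than one execution produces. The sketch below is a plain Python illustration with made-up execution names and file lists. It does not query Valohai.

```python
from collections import defaultdict


def find_collisions(outputs_by_execution):
    """Map each filename produced by more than one execution to its producers."""
    producers = defaultdict(list)
    for execution, filenames in outputs_by_execution.items():
        for filename in filenames:
            producers[filename].append(execution)
    return {name: execs for name, execs in producers.items() if len(execs) > 1}


# Hypothetical outputs of three executions in one task node
outputs = {
    "exec-1": ["model.pkl", "metrics_lr0.001.json"],
    "exec-2": ["model.pkl", "metrics_lr0.01.json"],
    "exec-3": ["model_lr0.1.pkl"],
}
collisions = find_collisions(outputs)
print(collisions)  # {'model.pkl': ['exec-1', 'exec-2']}
```

Here `model.pkl` is written by two executions, so only one of the two copies would reach the next node.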

#### Best practices for output naming

**Avoid:** All executions saving `model.pkl`\
**Better:** Include parameters in filename: `model_lr0.01_batch32.pkl`

**Python example with valohai-utils:**

```python
import valohai

# Get parameter values
lr = valohai.parameters("learning_rate").value
batch_size = valohai.parameters("batch_size").value

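# Create a unique output filename from the parameter values
model_filename = f"model_lr{lr}_batch{batch_size}.pkl"

# Resolve the path in the outputs directory; save the model to this path
model_path = valohai.outputs().path(model_filename)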
```
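The same naming idea extends to any set of parameters. The helper below is a minimal sketch that builds a filename from a parameter dictionary and replaces characters that are awkward in filenames. The function name `output_filename` and the example parameter names are hypothetical, and the helper does not depend on `valohai-utils`.

```python
import re


def output_filename(prefix, params, extension):
    """Build a filename that encodes every parameter value.

    Parameters are sorted by name so the same configuration always
    produces the same filename.
    """
    parts = [f"{name}{value}" for name, value in sorted(params.items())]
    raw = "_".join([prefix, *parts])
    # Replace anything outside letters, digits, dot, underscore, and hyphen
    safe = re.sub(r"[^A-Za-z0-9._-]", "-", raw)
    return f"{safe}.{extension}"


name = output_filename("model", {"lr": 0.01, "batch": 32}, "pkl")
print(name)  # model_batch32_lr0.01.pkl
```

As long as each execution in the task receives a distinct parameter combination, the generated names differ between executions.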

### Create task nodes in the UI

Convert any execution node with parameters into a task node directly in the pipeline builder:

1. Open your project's **Pipelines** tab
2. Click **Create Pipeline**
3. Select your pipeline blueprint
4. Click on any node that has parameters
5. Click **Convert to task** below the graph
6. Configure your parameter grid in the **Parameters** section
7. Click **Create pipeline**


