Run a Pipeline
This guide covers the three ways to create and run pipelines in Valohai: through the web interface, the command line, or the API.
Prerequisites
Before creating a pipeline, ensure you have:

- Individual steps defined and tested in your valohai.yaml
Define a pipeline in valohai.yaml
Pipelines are defined in your valohai.yaml configuration file. Here's a complete example:
```yaml
# Define individual steps first
- step:
    name: preprocess-dataset
    image: python:3.9
    command:
      - pip install numpy valohai-utils
      - python ./preprocess_dataset.py
    inputs:
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/mnist.npz

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install valohai-utils
      - python ./train_model.py {parameters}
    parameters:
      - name: epochs
        default: 5
        type: integer
      - name: learning_rate
        default: 0.001
        type: float
    inputs:
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/preprocessed_mnist.npz

- step:
    name: batch-inference
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install pillow valohai-utils
      - python ./batch_inference.py
    inputs:
      - name: model
      - name: images
        default:
          - https://valohaidemo.blob.core.windows.net/mnist/four-inverted.png
          - https://valohaidemo.blob.core.windows.net/mnist/five-inverted.png
          - https://valohaidemo.blob.core.windows.net/mnist/five-normal.jpg

# Define the pipeline structure
- pipeline:
    name: training-pipeline
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
      - name: train
        type: execution
        step: train-model
        override:
          inputs:
            - name: dataset  # Replace default inputs with values from the edge
      - name: evaluate
        type: execution
        step: batch-inference
    edges:
      - [preprocess.output.preprocessed_mnist.npz, train.input.dataset]
      - [train.output.model*, evaluate.input.model]
```

Key configuration elements
- Nodes: Reference your existing steps and give them names within the pipeline context.
- Edges: Define data flow between nodes using the format `[source_node.output.filename, target_node.input.input_name]`. Use wildcards (`*`) to pass multiple files matching a pattern. The filename segment must match a file the source node actually saves; see the sketch after this list.
- Override: Remove default values from inputs that will receive data via edges.
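
For instance, the edge `[preprocess.output.preprocessed_mnist.npz, train.input.dataset]` only resolves if the preprocess step actually saves a file named `preprocessed_mnist.npz`. Here is a minimal sketch of what `preprocess_dataset.py` might look like with valohai-utils; the preprocessing logic itself is illustrative:

```python
# Illustrative sketch of preprocess_dataset.py using valohai-utils.
# The filename given to outputs().path() is what the pipeline edge
# [preprocess.output.preprocessed_mnist.npz, train.input.dataset] matches on.
import numpy as np
import valohai

valohai.prepare(step="preprocess-dataset")

# Read the file resolved for the "dataset" input declared in valohai.yaml.
with np.load(valohai.inputs("dataset").path()) as data:
    x_train, y_train = data["x_train"], data["y_train"]
    x_test, y_test = data["x_test"], data["y_test"]

# Example transformation: scale pixel values to [0, 1].
x_train, x_test = x_train / 255.0, x_test / 255.0

# Files saved under outputs are uploaded and become available to downstream nodes.
np.savez(
    valohai.outputs().path("preprocessed_mnist.npz"),
    x_train=x_train, y_train=y_train, x_test=x_test, y_test=y_test,
)
```

The `train` node's override then clears the `dataset` input's default URL, so the file produced here is used instead.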
Create via web interface
Once your valohai.yaml with the pipeline definition is committed and Valohai has fetched the latest changes, you can create a pipeline run.
1. Navigate to your project.
2. Click the Pipelines tab.
3. Click Create pipeline.
4. Select your pipeline blueprint from the dropdown (populated from valohai.yaml).
5. Review and modify the configuration:
   - Adjust parameters for any node
   - Change input URLs if needed
   - Select different environments or Docker images
6. Click Create pipeline to launch.
💡 The web interface pre-fills configurations from your YAML but allows runtime overrides without changing code.
Create via command-line
Run a pipeline using the Valohai CLI:
```bash
# Run with the current directory's valohai.yaml
vh pipeline run training-pipeline --adhoc

# Run with the latest fetched commit in Valohai
vh pipeline run training-pipeline
```
CLI options
- `--adhoc`: Use local, uncommitted changes
- `--commit`: Specify a Git commit, branch, or tag to run, e.g. `vh pipeline run training-pipeline --commit main`
Create via API
For programmatic pipeline creation, send a POST request to the pipelines endpoint.
Basic example
```bash
curl -X POST https://app.valohai.com/api/v0/pipelines/ \
  -H "Authorization: Token YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d @pipeline.json
```

Pipeline configuration format
The API expects a JSON payload with the complete pipeline specification:
```json
{
  "project": "YOUR_PROJECT_ID",
  "title": "training-pipeline",
  "nodes": [
    {
      "name": "preprocess",
      "type": "execution",
      "template": {
        "environment": "ENVIRONMENT_ID",
        "commit": "main",
        "step": "preprocess-dataset"
      }
    },
    {
      "name": "train",
      "type": "execution",
      "template": {
        "commit": "main",
        "step": "train-model",
        "parameters": {
          "epochs": 10,
          "learning_rate": 0.001
        }
      },
      "on_error": "stop-all"
    }
  ],
  "edges": [
    {
      "source_node": "preprocess",
      "source_key": "preprocessed_mnist.npz",
      "source_type": "output",
      "target_node": "train",
      "target_type": "input",
      "target_key": "dataset"
    }
  ]
}
```

💡 Find your project ID and environment IDs via the web interface or API endpoints.
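
The same call in Python, as a minimal sketch using the `requests` library; it assumes the payload above is saved as `pipeline.json` and that your API token is exported as `VALOHAI_API_TOKEN` (an illustrative variable name):

```python
# Minimal sketch: create a pipeline via the Valohai API with `requests`.
# Assumes pipeline.json holds the payload shown above and the API token
# is in the VALOHAI_API_TOKEN environment variable (illustrative name).
import json
import os

import requests

with open("pipeline.json") as f:
    payload = json.load(f)

response = requests.post(
    "https://app.valohai.com/api/v0/pipelines/",
    headers={"Authorization": f"Token {os.environ['VALOHAI_API_TOKEN']}"},
    json=payload,
)
response.raise_for_status()
print(response.json())  # the created pipeline, including its ID
```

Project and environment IDs for the payload can likewise be listed programmatically, for example with a GET request to `/api/v0/projects/` using the same authorization header.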
Pipeline execution behavior
Once created, pipelines execute automatically:
- Nodes start when all their input edges are satisfied
- Parallel execution occurs when dependencies allow
- Failed nodes can be configured to stop the entire pipeline, as with `"on_error": "stop-all"` in the API example above, or to let other nodes continue
Next steps
- Learn about reusing nodes to avoid re-running successful steps
- Explore conditional execution for dynamic workflows
- Set up scheduled pipelines for production automation