Generate YAML with valohai-utils
Define Valohai steps in Python and generate valohai.yaml automatically
If you prefer defining ML workflows in Python instead of writing YAML by hand, the Python helper tool valohai-utils lets you generate valohai.yaml from your code.
This is optional. Many users write YAML directly to keep their code free of Valohai dependencies.
Why Generate YAML from Python?
Familiar syntax: If you're more comfortable with Python than YAML, this approach feels more natural.
Type safety: Python editors provide autocomplete and type checking, catching errors before execution.
Programmatic generation: Build YAML dynamically based on conditions, loops, or external configs.
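As a rough illustration of building step defaults from an external config, the sketch below merges a base set of parameters with overrides read from a JSON string. The names `BASE_PARAMETERS`, `build_default_parameters`, and `quick_run` are hypothetical and are not part of valohai-utils; only the parameter names come from the example on this page.

```python
import json

# Base defaults, matching the parameters used in the train.py example.
BASE_PARAMETERS = {"epochs": 10, "learning_rate": 0.001}


def build_default_parameters(config_text, quick_run=False):
    """Merge base defaults with overrides from a JSON config string.

    Hypothetical helper for illustration; not part of valohai-utils.
    """
    params = dict(BASE_PARAMETERS)
    overrides = json.loads(config_text) if config_text else {}

    # Reject keys that the step does not define, to catch typos early.
    unknown = set(overrides) - set(params)
    if unknown:
        raise KeyError(f"Unknown parameters in config: {sorted(unknown)}")

    params.update(overrides)

    # Condition-based tweak: a quick run trains for a single epoch.
    if quick_run:
        params["epochs"] = 1
    return params


params = build_default_parameters('{"learning_rate": 0.01}')
print(params)
```

The resulting dictionary could then be passed as `default_parameters` to `valohai.prepare`, so the generated YAML reflects whatever the config specified.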
How It Works
Install valohai-utils:
pip install valohai-utils
Define a step in your Python script:
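# train.py
import valohai

# Define default parameters
params = {
    "epochs": 10,
    "learning_rate": 0.001,
}

# Define default inputs
inputs = {
    "dataset": "s3://my-bucket/train.csv"
}

valohai.prepare(step="train", image="python:3.12", default_parameters=params, default_inputs=inputs)

# Read the resolved values (the defaults above, or overrides supplied at run time)
epochs = valohai.parameters("epochs").value
lr = valohai.parameters("learning_rate").value
dataset = valohai.inputs("dataset").path()

# Your training code
print(f"Training with lr={lr} for {epochs} epochs")
print(f"Dataset: {dataset}")
Generate the YAML: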
vh yaml step train.py
This creates the following valohai.yaml file:
- step:
    name: train
    image: python:3.12
    command: python train.py {parameters}
    parameters:
      - name: learning_rate
        default: 0.001
        type: float
      - name: epochs
        default: 10
        type: integer
    inputs:
      - name: dataset
        default: s3://my-bucket/train.csv
When to Use This Approach
You're Python-first: Your team is more comfortable with Python than YAML syntax.
Dynamic workflows: You need to generate steps programmatically based on runtime conditions.
Rapid prototyping: You want to define and test steps quickly without switching between files.
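To make the "dynamic workflows" case concrete, here is a minimal standard-library sketch that emits one step entry per configured model. It is not how valohai-utils generates YAML internally; `yaml_type`, `render_step`, and the `MODELS` mapping are hypothetical names used only to show how a loop and Python value types can drive the output.

```python
import json


def yaml_type(value):
    """Map a Python default value to a valohai.yaml parameter type name."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "flag"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    return "string"


def render_step(name, image, script, parameters):
    """Render one step entry as YAML text (hand-rolled, for illustration only)."""
    lines = [
        "- step:",
        f"    name: {name}",
        f"    image: {image}",
        f"    command: python {script} {{parameters}}",
    ]
    if parameters:
        lines.append("    parameters:")
    for key, default in parameters.items():
        lines += [
            f"      - name: {key}",
            # JSON scalars are also valid YAML scalars.
            f"        default: {json.dumps(default)}",
            f"        type: {yaml_type(default)}",
        ]
    return "\n".join(lines)


# Hypothetical runtime condition: one training step per configured model.
MODELS = {
    "small": {"epochs": 5},
    "large": {"epochs": 50, "learning_rate": 0.0001},
}

document = "\n".join(
    render_step(f"train-{model}", "python:3.12", "train.py", params)
    for model, params in MODELS.items()
)
print(document)
```

In practice you would let valohai-utils produce the file; the point of the sketch is only that steps defined in Python can be driven by data rather than copied by hand.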
When NOT to Use This Approach
Keep code clean: If you want your ML code to remain framework-agnostic, write YAML by hand.
Team collaboration: Non-Python users may find YAML easier to read and edit.
Complex pipelines: Large multi-step pipelines are often clearer in YAML than generated from Python.
What's Next?
Validate your YAML with the linter
Multiple YAML files for monorepo projects
Manage large YAML files with anchors
