Managing Large YAML Files
As your project grows to 30+ steps, your valohai.yaml can become repetitive and hard to maintain.
YAML anchors and aliases let you define reusable blocks once and reference them everywhere, keeping your config clean and consistent.
Why This Matters
Reduce duplication: Define common inputs, parameters, or commands once instead of copying them across dozens of steps.
Easier updates: Change a dataset path in one place, and it updates everywhere that references it.
Better readability: A 500-line YAML with anchors is easier to scan than a 2000-line file with repetition.
YAML Anchors & Aliases: The Basics
Define a reusable block with &anchor
- definitions:
    my-common-inputs: &common_inputs  # <- Anchor named "common_inputs"
      - name: dataset
        default: s3://my-bucket/train.csv
      - name: config
        default: s3://my-bucket/config.yaml

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command: python train.py
    inputs: *common_inputs  # Uses the block defined above

- step:
    name: evaluate-model
    image: tensorflow/tensorflow:2.6.0
    command: python evaluate.py
    inputs: *common_inputs  # Same inputs, no repetition

Both steps now share the same input definitions. Update &common_inputs once, and both steps inherit the change.
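To make the effect concrete, here is what a YAML parser sees for the train-model step once the *common_inputs alias is resolved. This expanded form is shown only for illustration; it is equivalent to the anchored version above.

```yaml
# Expanded form of the train-model step after alias resolution
- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command: python train.py
    inputs:
      - name: dataset
        default: s3://my-bucket/train.csv
      - name: config
        default: s3://my-bucket/config.yaml
```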
Common Use Cases
Shared input datasets
Repeated parameters
Standard commands
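As a rough sketch of the last two cases, the fragment below anchors a parameter list and a single command string, then reuses both in two steps. The anchor names (&training_params, &install_deps), the step names, the parameter names, and requirements.txt are all hypothetical; adapt them to your project.

```yaml
- definitions:
    training-params: &training_params  # hypothetical shared parameter list
      - name: learning_rate
        type: float
        default: 0.001
      - name: epochs
        type: integer
        default: 10
    install-deps: &install_deps pip install -r requirements.txt  # hypothetical shared command

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - *install_deps  # shared setup command
      - python train.py {parameters}
    parameters: *training_params

- step:
    name: tune-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - *install_deps
      - python tune.py {parameters}
    parameters: *training_params
```

Anchoring a single command string rather than a whole command list lets each step add its own commands, because YAML aliases cannot concatenate lists.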
Merge and Override with <<: *anchor
You can merge an anchor and add extra fields:
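A minimal sketch, assuming a hypothetical &base anchor that holds a single epochs parameter as a mapping. The merge key (<<) applies to mappings, not lists, so the anchor sits on one parameter entry rather than on the whole parameter list. The step name, default values, and the override shown here are illustrative assumptions.

```yaml
- definitions:
    base-parameter: &base  # hypothetical anchor: one parameter as a mapping
      name: epochs
      type: integer
      default: 10

- step:
    name: train-debug
    image: tensorflow/tensorflow:2.6.0
    command: python train.py {parameters}
    parameters:
      - <<: *base       # merge every field of &base into this entry
        default: 2      # override the default for this step only
      - name: debug_mode  # extra parameter defined only in this step
        type: flag
```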
This keeps the epochs parameter from &base and adds debug_mode.
Tips for Large YAML Files
Define anchors at the top: Keep all reusable blocks in a definitions section at the start of your file for easy reference.
Use descriptive anchor names: &training_params is clearer than &params1.
Don't over-anchor: If a block is only used once, don't create an anchor. They're for repeated content.
Lint regularly: Run vh lint after editing anchors to catch syntax mistakes.
What's Next?
Generate YAML with valohai-utils to skip writing YAML by hand (Python users)
Validate YAML with the linter to catch anchor syntax errors
Multiple YAML files for monorepo organization
