# valohai.yaml Overview

The `valohai.yaml` file is the configuration blueprint for your machine learning experiments. It defines your jobs, together with their parameters and dependencies, in a structured format. The file is typically stored in your project’s repository, which makes experiments easy to version and reproduce.

> 💡 **Tip:** Instead of writing YAML by hand, Python users can use `valohai-utils` to define Valohai steps directly in their code. See the [Generate YAML with valohai-utils](/valohai.yaml-overview/generate-from-python.md) section for more information.

Here’s a brief overview of what a `valohai.yaml` file typically contains:

1. **Name**: A user-friendly label for a step, pipeline, or deployment. Users run a specific step by referring to its name.
2. **Environment**: The default execution environment for jobs. It can refer to a set of cloud-based machines, on-premises machines, or a combination of both.
3. **Image**: The Docker image that provides the base environment for your step. It contains the software, libraries, and packages needed to run your code, for example Python 3.9, PyTorch, and various Python packages. The image shouldn’t contain your own code or data.
4. **Command**: One or more commands to execute within the chosen environment. Typically these run your machine learning training script, but other shell commands work too (e.g., `python train.py`, `mkdir test`, `apt-get install`, `pip install`).
5. **Inputs**: The files your experiment needs, such as datasets or models. Valohai downloads and caches these inputs, so your code can read them as if they were local files.
6. **Parameters**: Hyperparameters and other configurable settings for your experiment. They can be adjusted for each run, which lets you modify job configurations or perform hyperparameter tuning.
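As a minimal sketch, the six fields above might fit together in a single step as follows. The step name, image, input URL, and parameter are borrowed from the fuller example later on this page. The `environment` value is a hypothetical placeholder, so substitute an environment that exists in your own Valohai setup.

```yaml
- step:
    name: train-model                      # 1. Name
    environment: my-gpu-environment        # 2. Environment (hypothetical placeholder)
    image: tensorflow/tensorflow:2.6.0     # 3. Image
    command:                               # 4. Command
      - python ./train_model.py {parameters}
    inputs:                                # 5. Inputs
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/preprocessed_mnist.npz
    parameters:                            # 6. Parameters
      - name: epochs
        type: integer
        default: 5
```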

### Example valohai.yaml <a href="#id-1-example-valohai-yaml" id="id-1-example-valohai-yaml"></a>

```yaml
- step:
    name: preprocess-dataset
    image: python:3.9
    command:
      - pip install numpy valohai-utils
      - python ./preprocess_dataset.py
    inputs:
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/mnist.npz

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install valohai-utils
      - python ./train_model.py {parameters}
    parameters:
      - name: epochs
        default: 5
        type: integer
      - name: learning_rate
        default: 0.001
        type: float
    inputs:
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/preprocessed_mnist.npz

- step:
    name: batch-inference
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install pillow valohai-utils
      - python ./batch_inference.py
    inputs:
      - name: model
      - name: images
        default:
          - https://valohaidemo.blob.core.windows.net/mnist/four-inverted.png
          - https://valohaidemo.blob.core.windows.net/mnist/five-inverted.png
          - https://valohaidemo.blob.core.windows.net/mnist/five-normal.jpg

- pipeline:
    name: training-pipeline
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
      - name: train
        type: execution
        step: train-model
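        # Override the step's `dataset` input with no default,
        # so the file comes from the preprocess node via the edge below.
        override:
          inputs:
            - name: dataset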
      - name: evaluate
        type: execution
        step: batch-inference
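    edges:
      # Each edge is written as [source, target].
      - [preprocess.output.preprocessed_mnist.npz, train.input.dataset]
      # The wildcard matches train outputs whose names start with "model".
      - [train.output.model*, evaluate.input.model]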
```
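The `{parameters}` placeholder in the `train-model` command is filled in from the step's parameter list when the job runs. The sketch below illustrates this under two assumptions that are not stated on this page: that parameters are passed as `--name=value` by default, and that an optional `pass-as` key (with `{v}` standing for the value) changes how a single parameter is passed. Check both against the parameter reference before relying on them.

```yaml
- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      # With the defaults below, this is assumed to expand to:
      #   python ./train_model.py --epochs=5 --lr 0.001
      - python ./train_model.py {parameters}
    parameters:
      - name: epochs
        type: integer
        default: 5            # assumed to be passed as --epochs=5
      - name: learning_rate
        type: float
        default: 0.001
        pass-as: --lr {v}     # assumption: overrides the default flag format
```

If `pass-as` is used this way, the training script has to accept the custom flag (`--lr` here) rather than `--learning_rate`.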

