# valohai.yaml Reference

This section provides comprehensive reference documentation for `valohai.yaml` configuration options.

The `valohai.yaml` file is the configuration blueprint for your machine learning workflows in Valohai. While the [valohai.yaml Overview](https://github.com/valohai/dokuhai/blob/main/docs/valohai.yaml-overview) provides an introduction and examples, this reference section documents all available configuration options in detail.

## Top-level Properties

* [`endpoint`](#endpoint)
* [`pipeline`](#pipeline)
* [`step`](#step)
* [`task`](#task)

## Property Details

### `endpoint`

* **type:** *object*
* **properties:**
  * `description` *(string)*
  * `files` *(array)*\
    ➤ Files that will be loaded into the image, for example the trained model.
    * array items
      * [`file-item`](#file-item)
  * `image` *(string)*\
    ➤ Docker image used to run the deployment code in.
  * `name` *(string)*\
    ➤ Name of the endpoint, which is used as part of the URL (i.e. /project/version/name) of the deployed endpoint. Endpoint names must be lowercase, start with a letter and may contain letters, numbers and dashes thereafter.
    * pattern: `^[a-z][a-z0-9-]+$`
  * `node-selector` *(string)*\
    ➤ Kubernetes node selector specifies which nodes the deployment should be scheduled to.
    * pattern: `=`
  * `port` *(number)*\
    ➤ Port used for the HTTP server. Defaults to 8000.
    * maximum: 65535
    * minimum: 1025
  * `resources` *(object)*\
    ➤ Resource requirements and limits for the endpoint.
    * additionalProperties: *False*
    * object properties
      * `cpu`
        * [`workflow-resource-cpu`](#workflow-resource-cpu)
      * `devices`
        * [`workflow-resource-devices`](#workflow-resource-devices)
      * `memory`
        * [`workflow-resource-memory`](#workflow-resource-memory)
  * `server-command`\
    ➤ Command that runs a HTTP server.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `tolerations` *(array)*\
    ➤ List of Kubernetes tolerations specifying which node taints the endpoint should tolerate.
    * array items
      * [`toleration`](#toleration)
  * `wsgi` *(string)*\
    ➤ Specifies the WSGI application to serve (e.g. a Flask application). Specify the module (i.e. `package.app`) or the module and the WSGI callable (i.e. `package.app:wsgi_callable`).
* **required properties:** `image`, `name`

### `pipeline`

* **type:** *object*
* **properties:**
  * `edges` *(array)*
    * array items
      * anyOf
        * \*array – ShorthandEdge\
          ➤ A shorthand edge definition as a two-item array: \[source, target]. Source and target format: `node-name.edge-type.edge-key`.\
          Allowed edge types: parameter | file | dependency | output | environment-variable | metadata | input.\
          Example: `[train-model.output.model.pkl, test-model.input.model]` \*
          * items: False
          * prefixItems
            * *string*
            * *string*
        * *object – FullEdge*\
          \&#xNAN;*➤ A full edge definition as an object with source, target and optional additional configuration. Same source and target format as in the shorthand edge definition.*
          * additionalProperties: *False*
          * `properties`
            * `configuration` *(object)*
            * `source` *(string)*
            * `target` *(string)*
          * required properties: `source`, `target`
  * `name` *(string)*
  * `nodes` *(array)*
    * array items
      * anyOf
        * [*`execution-node`*](#execution-node)
        * [*`deployment-node`*](#deployment-node)
        * [*`task-node`*](#task-node)
  * `parameters` *(array)*
    * array items
      * [`pipeline-param`](#pipeline-param)
  * `reuse-executions` *(boolean)*\
    ➤ Set to true to allow Valohai to automatically detect if the executions insteps' nodes could be skipped and reuse the result from a previously successfully run node. Node's execution is considered unchanged when the step's data doesn't change (inputs, command, commit hash, parameters, step's configuration).
    * default: False
* **required properties:** `edges`, `name`, `nodes`

### `step`

* **type:** *object*
* **properties:**
  * `category` *(string)*\
    ➤ Category name to group & organize steps in the UI
  * `command`\
    ➤ The command or commands to run.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `description` *(string)*\
    ➤ Describes the step. This is shown as a help text in the user interface.
  * `environment` *(string)*\
    ➤ Default execution environment (ID or substring of environment name)
  * `environment-variable-groups` *(array)*\
    ➤ Groups of environment variables to include in the step execution.
    * array items *(string)*
  * `environment-variables` *(array)*\
    ➤ Environment variables and their default values for the step.
    * array items
      * [`environment-variable-item`](#environment-variable-item)
  * `icon` *(string)*\
    ➤ URL to the icon representing the step
  * `image` *(string)*\
    ➤ The Docker image to use for this step.
  * `inputs` *(array)*\
    ➤ The inputs expected by the step.
    * array items
      * [`input-item`](#input-item)
  * `mounts` *(array)*\
    ➤ Custom mounts to enable, for private environments allowing them.
    * array items
      * [`mount-item`](#mount-item)
  * `cache-volumes` *(array)*\
    ➤ Kubernetes persistent volume claim names to used as additional input cache volumes.
    * array items *(string)*
  * `name` *(string)*\
    ➤ The unique name for this step.
  * `no-output-timeout`\
    ➤ The time after which the step is considered to have died if no output has been generated (in seconds or as a string (e.g. "1h 30m 5s")). An unspecified value means "platform default".
    * oneOf
      * *integer*
      * *string*
  * `outputs` *(array)*\
    ➤ The outputs expected to be generated by the step.
    * array items
      * [`output-item`](#output-item)
  * `parameters` *(array)*\
    ➤ The parameter definitions and default values to be interpolated into the command.
    * array items
      * [`param-item`](#param-item)
  * `resources` *(object)*\
    ➤ Resource requirements and limits for the workflow.
    * additionalProperties: *False*
    * object properties
      * `cpu`
        * [`workflow-resource-cpu`](#workflow-resource-cpu)
      * `devices`
        * [`workflow-resource-devices`](#workflow-resource-devices)
      * `memory`
        * [`workflow-resource-memory`](#workflow-resource-memory)
  * `runtime-config-preset` *(string)*\
    ➤ The runtime configuration preset ID or slug to use for this step.
  * `source-path` *(string)*\
    ➤ The original source file that this step comes from, relative to this config file.
  * `stop-condition` *(string)*
  * `time-limit`\
    ➤ The time limit for the step (in seconds or as a string, e.g. "1h 30m 5s"). An unspecified value means "no timeout".
    * oneOf
      * *integer*
      * *string*
  * `upload-store` *(string)*\
    ➤ The output data store name or UUID.
    * maxLength: 64
* **required properties:** `command`, `image`, `name`

### `task`

* **type:** *object*
* **properties:**
  * `description` *(string)*\
    ➤ Describes the task. This is shown as a help text in the user interface.
  * `engine` *(string)*
  * `execution-batch-size` *(integer)*
  * `execution-count` *(integer)*
  * `maximum-queued-executions` *(integer)*
  * `name` *(string)*\
    ➤ The unique name for this task.
  * `on-child-error` *(string)*
  * `optimization-target-metric` *(string)*
  * `optimization-target-value` *(number)*
  * `parameter-sets` *(array)*\
    ➤ Parameter sets for manual search mode.
    * array items *(object)*
  * `parameters` *(array)*\
    ➤ The variant parameter definitions and default values to be interpolated into the command.
    * array items
      * [`variant-param`](#variant-param)
  * `reuse-children` *(boolean)*\
    ➤ Whether to allow automatic reuse of previous successful executions.
    * default: False
  * `step` *(string)*\
    ➤ The step to run with.
  * `stop-condition` *(string)*
  * `type` *(string)*
* **required properties:** `name`, `step`

### `deployment`

Optional deployment definitions with defaults for initializing deployments automatically when a related pipeline is run and no matching deployment is found in the project.

* **type:** *object*
* **properties:**
  * `name` *(string)*\
    ➤ The unique, URL-friendly name of the deployment.
  * `defaults` *(object)* – DeploymentDefaults\
    ➤ Values to use if the deployment does not yet exist in the project context.
    * additionalProperties: *True*
    * object properties
      * `target` *(string)*\
        ➤ Deployment target identifier or UUID.
    * required properties: `target`
* **required properties:** `name`

### `deployment-node`

* **type:** *object*
* **properties:**
  * `actions` *(array)*
    * array items
      * [`node-action`](#node-action)
  * `aliases` *(array)*
    * array items *(string)*
  * `deployment` *(string)*
  * `endpoints` *(array)*
    * array items *(string)*
  * `name` *(string)*
  * `on-error` *(string)*
    * allowed values: `continue` | `stop-all` | `stop-next`
  * `type` *(string)*
    * const: `deployment`
* **required properties:** `deployment`, `name`, `type`

### `environment-variable-item`

* **type:** *object*
* **properties:**
  * `default` *(string)*\
    ➤ The default value for the environment variable.
  * `description` *(string)*\
    ➤ Describes the environment variable's meaning. This can be shown as a help text in the user interface.
  * `name` *(string)*\
    ➤ The environment variable. This must be unique within a Step.
  * `optional` *(boolean)*\
    ➤ Whether this environment variable is optional. All environment variables are optional by default. Executions may not be created unless all required environment variables have a value, configured either explicitly in the step or trickled down from project-level settings.
    * default: True
* **required properties:** `name`

### `execution-node`

* **type:** *object*
* **properties:**
  * `actions` *(array)*
    * array items
      * [`node-action`](#node-action)
  * `commit` *(string)*
  * `edge-merge-mode`
    * [`node-edge-merge-mode`](#node-edge-merge-mode)
  * `name` *(string)*
  * `on-error` *(string)*
    * allowed values: `continue` | `retry` | `stop-all` | `stop-next`
  * `override`
    * [`overridden-properties`](#overridden-properties)
  * `step` *(string)*
  * `type` *(string)*
    * const: `execution`
* **required properties:** `name`, `step`, `type`

### `file-item`

* **type:** *object*
* **properties:**
  * `description` *(string)*\
    ➤ The description of the file.
  * `name` *(string)*\
    ➤ The symbolic name of this file.
  * `path` *(string)*\
    ➤ Relative path of the file from the endpoint's working directory. For example "data/serialized\_model.hdf5".
* **required properties:** `name`, `path`

### `input-item`

* **type:** *object*
* **properties:**
  * `default`\
    ➤ The default URL(s) for the input.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `description` *(string)*\
    ➤ Describes the input. This is shown as a help text in the user interface.
  * `download` *(string)*\
    ➤ Select downloading intention for this input. On-demand inputs are not downloaded before the step can run, instead are available to download using an authenticated download URL.
    * default: always
    * allowed values: `always` | `on-demand`
  * `filename` *(string)*\
    ➤ Force a filename for this input. Will not have an effect if there are multiple files for this input.
  * `keep-directories`\
    ➤ Whether to retain directories when using wildcards for this input.
    * default: False
    * oneOf
      * *boolean*
      * *string*
        * allowed values: `full` | `none` | `suffix`
  * `name` *(string)*\
    ➤ The symbolic name of this input. This must be unique within a Step.
    * maxLength: 64
  * `optional` *(boolean)*\
    ➤ Whether this input is optional. Optional inputs may be disabled in the user interface.
    * default: False
* **required properties:** `name`

### `mount-item`

* **type:** *anyOf*
* **properties:**
  * anyOf
    * *string*\
      \&#xNAN;*➤ A simple source:destination mount string. Note both parts must be absolute.*
      * pattern: `^/.+:/.+$`
    * *object*
      * additionalProperties: *False*
      * `properties`
        * `destination` *(string)*\
          ➤ Container path for data
          * pattern: `^/.+$`
        * `options` *(object)*\
          ➤ Additional mount options
        * `readonly` *(boolean)*\
          ➤ Whether to mount the volume read-only
          * default: False
        * `source` *(string)*\
          ➤ Host path for data (for local sources; for non-local mount types, depends)
        * `type` *(string)*\
          ➤ Mount type
      * required properties: `destination`, `source`

### `node-action`

* **type:** *object*
* **properties:**
  * `if`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `then`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `when`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
* **required properties:** `then`, `when`

### `node-edge-merge-mode`

Whether to replace the values set for this input, or to append to them when overridden by e.g. a pipeline edge.

* **type:** *string*

### `output-item`

* **type:** *object*
* **properties:**
  * `description` *(string)*\
    ➤ Describes the output.
  * `name` *(string)*\
    ➤ The symbolic name of this output. This must be unique within a Step.
* **required properties:** `name`

### `overridden-properties`

* **type:** *object*
* **properties:**
  * `command`\
    ➤ The command or commands to run.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `environment` *(string)*\
    ➤ Default execution environment (ID or substring of environment name)
  * `environment-variables` *(array)*\
    ➤ Environment variables and their default values for the node.
    * array items
      * [`environment-variable-item`](#environment-variable-item)
  * `image` *(string)*\
    ➤ The Docker image to use for this node.
  * `inputs` *(array)*\
    ➤ The inputs expected by the node.
    * array items
      * [`input-item`](#input-item)
  * `mounts` *(array)*\
    ➤ Custom mounts to enable, for private environments allowing them.
    * array items
      * [`mount-item`](#mount-item)
  * `parameters` *(array)*\
    ➤ The parameter definitions and default values to be interpolated into the command.
    * array items
      * [`param-item`](#param-item)

### `param-item`

* **type:** *object*
* **properties:**
  * `category` *(string)*\
    ➤ Free-form category for the parameter. Shown in the user interface.
  * `choices` *(array)*\
    ➤ If choices are set, the user may only choose from these choices for the parameter. Not applicable to flag parameters.
  * `default`\
    ➤ The default value for the parameter.
  * `description` *(string)*\
    ➤ Describes the parameter. This is shown as a help text in the user interface.
  * `max` *(number)*\
    ➤ The maximum accepted value for the parameter. Only applicable to integer/float parameters.
  * `min` *(number)*\
    ➤ The minimum accepted value for the parameter. Only applicable to integer/float parameters.
  * `multiple` *(string)*\
    ➤ Whether to allow multiple values for this parameter, and how to pass them on the command line. Parameters set to `repeat` will be repeated using the `--pass-as` pattern, while `separate` will interpolate all values into a single `--pass-as` value placeholder using the `multiple-separator`. `multiple` is not available for flag parameters.
    * default: none
    * allowed values: `none` | `repeat` | `separate`
  * `multiple-separator` *(string)*\
    ➤ The separator to use when interpolating multiple values into a command parameter. Defaults to a comma.
    * default: ,
  * `name` *(string)*\
    ➤ The symbolic name of this parameter. This must be unique within a Step.
  * `optional` *(boolean)*\
    ➤ Whether this parameter is optional. Optional parameters may be disabled in the user interface. Not applicable to flag parameters; they're always optional in a sense.
    * default: False
  * `pass-as` *(string)*\
    ➤ How to pass the parameter to the command.Defaults to `--{name}={value}` (or `--{name}` for flags).
    * default: --{name}={value}
  * `pass-false-as` *(string)*\
    ➤ How to pass flag (boolean) parameters when the flag is false.If either pass-true-as or pass-false-as are set, pass-as is not used.
  * `pass-true-as` *(string)*\
    ➤ How to pass flag (boolean) parameters when the flag is true.If either pass-true-as or pass-false-as are set, pass-as is not used.
  * `type` *(string)*\
    ➤ The type of the parameter.
    * default: string
    * allowed values: `flag` | `float` | `integer` | `string`
  * `widget`\
    ➤ UI widget used for editing parameter values
    * oneOf
      * *object*
        * `properties`
          * `settings` *(object)*
          * `type` *(string)*
        * required properties: `type`
      * *string*
* **required properties:** `name`

### `pipeline-param`

* **type:** *object*
* **properties:**
  * `category` *(string)*\
    ➤ Free-form category for the parameter. Shown in the user interface.
  * `default`\
    ➤ The default value for the pipeline parameter.
  * `description` *(string)*\
    ➤ Describes the parameter. This is shown as a help text in the user interface.
  * `name` *(string)*
  * `target` *(string)*
  * `targets`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*

### `task-node`

* **type:** *object*
* **properties:**
  * `actions` *(array)*
    * array items
      * [`node-action`](#node-action)
  * `commit` *(string)*
  * `edge-merge-mode`
    * [`node-edge-merge-mode`](#node-edge-merge-mode)
  * `name` *(string)*
  * `on-error` *(string)*
    * allowed values: `continue` | `retry` | `stop-all` | `stop-next`
  * `override`
    * [`overridden-properties`](#overridden-properties)
  * `step` *(string)*
  * `type` *(string)*
    * const: `task`
  * `task` *(string)*
* **required properties:** `name`, `type`

### `toleration`

* **type:** *object*
* **properties:**
  * `effect` *(string)*\
    ➤ Optional, empty tolerates all effects.
    * allowed values: `NoExecute` | `NoSchedule` | `PreferNoSchedule`
  * `key` *(string)*\
    ➤ Optional if operator is "Exists", to match all keys.
  * `operator` *(string)*\
    ➤ Optional, empty means "Equal".
    * allowed values: `Equal` | `Exists`
  * `value` *(string)*\
    ➤ Optional as empty in some contexts like when operator is "Exists".

### `variant-param`

* **type:** *object*
* **properties:**
  * `name` *(string)*
  * `rules`
    * [`variant-param-rule-item`](#variant-param-rule-item)
  * `style` *(string)*
    * allowed values: `linear` | `logspace` | `multiple` | `random` | `single`
* **required properties:** `name`, `rules`, `style`

### `variant-param-rule-item`

* **type:** *object*
* **properties:**
  * `base` *(number)*
  * `count` *(number)*
  * `distribution` *(string)*
  * `end` *(number)*
  * `integerify` *(boolean)*
  * `items` *(array)*
    * array items *(number)*
  * `max` *(number)*
  * `min` *(number)*
  * `numberify` *(boolean)*
  * `seed` *(number)*
  * `start` *(number)*
  * `step` *(number)*
  * `value` *(number)*

### `workflow-resource-cpu`

CPU resource requirements and limits in vCPU counts e.g. 0.1 is 10% of one CPU.

* **type:** *object*
* **properties:**
  * `max` *(number)*
  * `min` *(number)*

### `workflow-resource-devices`

Devices required e.g. 'nvidia.com/gpu: 1' for one NVIDIA GPU.

* **type:** *object*

### `workflow-resource-memory`

Memory requirements and limits in MB e.g. 100 is 100 MB.

* **type:** *object*
* **properties:**
  * `max` *(number)*
  * `min` *(number)*

## Related Documentation

* [valohai.yaml Overview](https://github.com/valohai/dokuhai/blob/main/docs/valohai.yaml-overview) - Introduction and basic examples
* [Using Parameters](/getting-started/intro/parameters.md) - Tutorial on working with parameters
* [How-to: Manage YAML](/valohai.yaml-overview/multiple-files.md) - Practical guides for YAML management


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.valohai.com/valohai-yaml.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
