# valohai.yaml Reference

This section provides comprehensive reference documentation for `valohai.yaml` configuration options.

The `valohai.yaml` file is the configuration blueprint for your machine learning workflows in Valohai. While the [valohai.yaml Overview](https://github.com/valohai/dokuhai/blob/main/docs/valohai.yaml-overview) provides an introduction and examples, this reference section documents all available configuration options in detail.

## Top-level Properties

* [`endpoint`](#endpoint)
* [`pipeline`](#pipeline)
* [`step`](#step)
* [`task`](#task)

## Property Details

### `endpoint`

* **type:** *object*
* **properties:**
  * `description` *(string)*
  * `files` *(array)*\
    ➤ Files that will be loaded into the image, for example the trained model.
    * array items
      * [`file-item`](#file-item)
  * `image` *(string)*\
    ➤ Docker image used to run the deployment code in.
  * `name` *(string)*\
    ➤ Name of the endpoint, which is used as part of the URL (i.e. /project/version/name) of the deployed endpoint. Endpoint names must be lowercase, start with a letter and may contain letters, numbers and dashes thereafter.
    * pattern: `^[a-z][a-z0-9-]+$`
  * `node-selector` *(string)*\
    ➤ Kubernetes node selector specifies which nodes the deployment should be scheduled to.
    * pattern: `=`
  * `port` *(number)*\
    ➤ Port used for the HTTP server. Defaults to 8000.
    * maximum: 65535
    * minimum: 1025
  * `resources` *(object)*\
    ➤ Resource requirements and limits for the endpoint.
    * additionalProperties: *False*
    * object properties
      * `cpu`
        * [`workflow-resource-cpu`](#workflow-resource-cpu)
      * `devices`
        * [`workflow-resource-devices`](#workflow-resource-devices)
      * `memory`
        * [`workflow-resource-memory`](#workflow-resource-memory)
  * `server-command`\
    ➤ Command that runs a HTTP server.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `tolerations` *(array)*\
    ➤ List of Kubernetes tolerations specifying which node taints the endpoint should tolerate.
    * array items
      * [`toleration`](#toleration)
  * `wsgi` *(string)*\
    ➤ Specifies the WSGI application to serve (e.g. a Flask application). Specify the module (i.e. `package.app`) or the module and the WSGI callable (i.e. `package.app:wsgi_callable`).
* **required properties:** `image`, `name`

### `pipeline`

* **type:** *object*
* **properties:**
  * `edges` *(array)*
    * array items
      * anyOf
        * \*array – ShorthandEdge\
          ➤ A shorthand edge definition as a two-item array: \[source, target]. Source and target format: `node-name.edge-type.edge-key`.\
          Allowed edge types: parameter | file | dependency | output | environment-variable | metadata | input.\
          Example: `[train-model.output.model.pkl, test-model.input.model]` \*
          * items: False
          * prefixItems
            * *string*
            * *string*
        * *object – FullEdge*\
          \&#xNAN;*➤ A full edge definition as an object with source, target and optional additional configuration. Same source and target format as in the shorthand edge definition.*
          * additionalProperties: *False*
          * `properties`
            * `configuration` *(object)*
            * `source` *(string)*
            * `target` *(string)*
          * required properties: `source`, `target`
  * `name` *(string)*
  * `nodes` *(array)*
    * array items
      * anyOf
        * [*`execution-node`*](#execution-node)
        * [*`deployment-node`*](#deployment-node)
        * [*`task-node`*](#task-node)
  * `parameters` *(array)*
    * array items
      * [`pipeline-param`](#pipeline-param)
  * `reuse-executions` *(boolean)*\
    ➤ Set to true to allow Valohai to automatically detect if the executions insteps' nodes could be skipped and reuse the result from a previously successfully run node. Node's execution is considered unchanged when the step's data doesn't change (inputs, command, commit hash, parameters, step's configuration).
    * default: False
* **required properties:** `edges`, `name`, `nodes`

### `step`

* **type:** *object*
* **properties:**
  * `category` *(string)*\
    ➤ Category name to group & organize steps in the UI
  * `command`\
    ➤ The command or commands to run.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `description` *(string)*\
    ➤ Describes the step. This is shown as a help text in the user interface.
  * `environment` *(string)*\
    ➤ Default execution environment (ID or substring of environment name)
  * `environment-variable-groups` *(array)*\
    ➤ Groups of environment variables to include in the step execution.
    * array items *(string)*
  * `environment-variables` *(array)*\
    ➤ Environment variables and their default values for the step.
    * array items
      * [`environment-variable-item`](#environment-variable-item)
  * `icon` *(string)*\
    ➤ URL to the icon representing the step
  * `image` *(string)*\
    ➤ The Docker image to use for this step.
  * `inputs` *(array)*\
    ➤ The inputs expected by the step.
    * array items
      * [`input-item`](#input-item)
  * `mounts` *(array)*\
    ➤ Custom mounts to enable, for private environments allowing them.
    * array items
      * [`mount-item`](#mount-item)
  * `cache-volumes` *(array)*\
    ➤ Kubernetes persistent volume claim names to used as additional input cache volumes.
    * array items *(string)*
  * `name` *(string)*\
    ➤ The unique name for this step.
  * `no-output-timeout`\
    ➤ The time after which the step is considered to have died if no output has been generated (in seconds or as a string (e.g. "1h 30m 5s")). An unspecified value means "platform default".
    * oneOf
      * *integer*
      * *string*
  * `outputs` *(array)*\
    ➤ The outputs expected to be generated by the step.
    * array items
      * [`output-item`](#output-item)
  * `parameters` *(array)*\
    ➤ The parameter definitions and default values to be interpolated into the command.
    * array items
      * [`param-item`](#param-item)
  * `resources` *(object)*\
    ➤ Resource requirements and limits for the workflow.
    * additionalProperties: *False*
    * object properties
      * `cpu`
        * [`workflow-resource-cpu`](#workflow-resource-cpu)
      * `devices`
        * [`workflow-resource-devices`](#workflow-resource-devices)
      * `memory`
        * [`workflow-resource-memory`](#workflow-resource-memory)
  * `runtime-config-preset` *(string)*\
    ➤ The runtime configuration preset ID or slug to use for this step.
  * `source-path` *(string)*\
    ➤ The original source file that this step comes from, relative to this config file.
  * `stop-condition` *(string)*
  * `time-limit`\
    ➤ The time limit for the step (in seconds or as a string, e.g. "1h 30m 5s"). An unspecified value means "no timeout".
    * oneOf
      * *integer*
      * *string*
  * `upload-store` *(string)*\
    ➤ The output data store name or UUID.
    * maxLength: 64
* **required properties:** `command`, `image`, `name`

### `task`

* **type:** *object*
* **properties:**
  * `description` *(string)*\
    ➤ Describes the task. This is shown as a help text in the user interface.
  * `engine` *(string)*
  * `execution-batch-size` *(integer)*
  * `execution-count` *(integer)*
  * `maximum-queued-executions` *(integer)*
  * `name` *(string)*\
    ➤ The unique name for this task.
  * `on-child-error` *(string)*
  * `optimization-target-metric` *(string)*
  * `optimization-target-value` *(number)*
  * `parameter-sets` *(array)*\
    ➤ Parameter sets for manual search mode.
    * array items *(object)*
  * `parameters` *(array)*\
    ➤ The variant parameter definitions and default values to be interpolated into the command.
    * array items
      * [`variant-param`](#variant-param)
  * `reuse-children` *(boolean)*\
    ➤ Whether to allow automatic reuse of previous successful executions.
    * default: False
  * `step` *(string)*\
    ➤ The step to run with.
  * `stop-condition` *(string)*
  * `type` *(string)*
* **required properties:** `name`, `step`

### `deployment`

Optional deployment definitions with defaults for initializing deployments automatically when a related pipeline is run and no matching deployment is found in the project.

* **type:** *object*
* **properties:**
  * `name` *(string)*\
    ➤ The unique, URL-friendly name of the deployment.
  * `defaults` *(object)* – DeploymentDefaults\
    ➤ Values to use if the deployment does not yet exist in the project context.
    * additionalProperties: *True*
    * object properties
      * `target` *(string)*\
        ➤ Deployment target identifier or UUID.
    * required properties: `target`
* **required properties:** `name`

### `deployment-node`

* **type:** *object*
* **properties:**
  * `actions` *(array)*
    * array items
      * [`node-action`](#node-action)
  * `aliases` *(array)*
    * array items *(string)*
  * `deployment` *(string)*
  * `endpoints` *(array)*
    * array items *(string)*
  * `name` *(string)*
  * `on-error` *(string)*
    * allowed values: `continue` | `stop-all` | `stop-next`
  * `type` *(string)*
    * const: `deployment`
* **required properties:** `deployment`, `name`, `type`

### `environment-variable-item`

* **type:** *object*
* **properties:**
  * `default` *(string)*\
    ➤ The default value for the environment variable.
  * `description` *(string)*\
    ➤ Describes the environment variable's meaning. This can be shown as a help text in the user interface.
  * `name` *(string)*\
    ➤ The environment variable. This must be unique within a Step.
  * `optional` *(boolean)*\
    ➤ Whether this environment variable is optional. All environment variables are optional by default. Executions may not be created unless all required environment variables have a value, configured either explicitly in the step or trickled down from project-level settings.
    * default: True
* **required properties:** `name`

### `execution-node`

* **type:** *object*
* **properties:**
  * `actions` *(array)*
    * array items
      * [`node-action`](#node-action)
  * `commit` *(string)*
  * `edge-merge-mode`
    * [`node-edge-merge-mode`](#node-edge-merge-mode)
  * `name` *(string)*
  * `on-error` *(string)*
    * allowed values: `continue` | `retry` | `stop-all` | `stop-next`
  * `override`
    * [`overridden-properties`](#overridden-properties)
  * `step` *(string)*
  * `type` *(string)*
    * const: `execution`
* **required properties:** `name`, `step`, `type`

### `file-item`

* **type:** *object*
* **properties:**
  * `description` *(string)*\
    ➤ The description of the file.
  * `name` *(string)*\
    ➤ The symbolic name of this file.
  * `path` *(string)*\
    ➤ Relative path of the file from the endpoint's working directory. For example "data/serialized\_model.hdf5".
* **required properties:** `name`, `path`

### `input-item`

* **type:** *object*
* **properties:**
  * `default`\
    ➤ The default URL(s) for the input.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `description` *(string)*\
    ➤ Describes the input. This is shown as a help text in the user interface.
  * `download` *(string)*\
    ➤ Select downloading intention for this input. On-demand inputs are not downloaded before the step can run, instead are available to download using an authenticated download URL.
    * default: always
    * allowed values: `always` | `on-demand`
  * `filename` *(string)*\
    ➤ Force a filename for this input. Will not have an effect if there are multiple files for this input.
  * `keep-directories`\
    ➤ Whether to retain directories when using wildcards for this input.
    * default: False
    * oneOf
      * *boolean*
      * *string*
        * allowed values: `full` | `none` | `suffix`
  * `name` *(string)*\
    ➤ The symbolic name of this input. This must be unique within a Step.
    * maxLength: 64
  * `optional` *(boolean)*\
    ➤ Whether this input is optional. Optional inputs may be disabled in the user interface.
    * default: False
* **required properties:** `name`

### `mount-item`

* **type:** *anyOf*
* **properties:**
  * anyOf
    * *string*\
      \&#xNAN;*➤ A simple source:destination mount string. Note both parts must be absolute.*
      * pattern: `^/.+:/.+$`
    * *object*
      * additionalProperties: *False*
      * `properties`
        * `destination` *(string)*\
          ➤ Container path for data
          * pattern: `^/.+$`
        * `options` *(object)*\
          ➤ Additional mount options
        * `readonly` *(boolean)*\
          ➤ Whether to mount the volume read-only
          * default: False
        * `source` *(string)*\
          ➤ Host path for data (for local sources; for non-local mount types, depends)
        * `type` *(string)*\
          ➤ Mount type
      * required properties: `destination`, `source`

### `node-action`

* **type:** *object*
* **properties:**
  * `if`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `then`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `when`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
* **required properties:** `then`, `when`

### `node-edge-merge-mode`

Whether to replace the values set for this input, or to append to them when overridden by e.g. a pipeline edge.

* **type:** *string*

### `output-item`

* **type:** *object*
* **properties:**
  * `description` *(string)*\
    ➤ Describes the output.
  * `name` *(string)*\
    ➤ The symbolic name of this output. This must be unique within a Step.
* **required properties:** `name`

### `overridden-properties`

* **type:** *object*
* **properties:**
  * `command`\
    ➤ The command or commands to run.
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*
  * `environment` *(string)*\
    ➤ Default execution environment (ID or substring of environment name)
  * `environment-variables` *(array)*\
    ➤ Environment variables and their default values for the node.
    * array items
      * [`environment-variable-item`](#environment-variable-item)
  * `image` *(string)*\
    ➤ The Docker image to use for this node.
  * `inputs` *(array)*\
    ➤ The inputs expected by the node.
    * array items
      * [`input-item`](#input-item)
  * `mounts` *(array)*\
    ➤ Custom mounts to enable, for private environments allowing them.
    * array items
      * [`mount-item`](#mount-item)
  * `parameters` *(array)*\
    ➤ The parameter definitions and default values to be interpolated into the command.
    * array items
      * [`param-item`](#param-item)

### `param-item`

* **type:** *object*
* **properties:**
  * `category` *(string)*\
    ➤ Free-form category for the parameter. Shown in the user interface.
  * `choices` *(array)*\
    ➤ If choices are set, the user may only choose from these choices for the parameter. Not applicable to flag parameters.
  * `default`\
    ➤ The default value for the parameter.
  * `description` *(string)*\
    ➤ Describes the parameter. This is shown as a help text in the user interface.
  * `max` *(number)*\
    ➤ The maximum accepted value for the parameter. Only applicable to integer/float parameters.
  * `min` *(number)*\
    ➤ The minimum accepted value for the parameter. Only applicable to integer/float parameters.
  * `multiple` *(string)*\
    ➤ Whether to allow multiple values for this parameter, and how to pass them on the command line. Parameters set to `repeat` will be repeated using the `--pass-as` pattern, while `separate` will interpolate all values into a single `--pass-as` value placeholder using the `multiple-separator`. `multiple` is not available for flag parameters.
    * default: none
    * allowed values: `none` | `repeat` | `separate`
  * `multiple-separator` *(string)*\
    ➤ The separator to use when interpolating multiple values into a command parameter. Defaults to a comma.
    * default: ,
  * `name` *(string)*\
    ➤ The symbolic name of this parameter. This must be unique within a Step.
  * `optional` *(boolean)*\
    ➤ Whether this parameter is optional. Optional parameters may be disabled in the user interface. Not applicable to flag parameters; they're always optional in a sense.
    * default: False
  * `pass-as` *(string)*\
    ➤ How to pass the parameter to the command.Defaults to `--{name}={value}` (or `--{name}` for flags).
    * default: --{name}={value}
  * `pass-false-as` *(string)*\
    ➤ How to pass flag (boolean) parameters when the flag is false.If either pass-true-as or pass-false-as are set, pass-as is not used.
  * `pass-true-as` *(string)*\
    ➤ How to pass flag (boolean) parameters when the flag is true.If either pass-true-as or pass-false-as are set, pass-as is not used.
  * `type` *(string)*\
    ➤ The type of the parameter.
    * default: string
    * allowed values: `flag` | `float` | `integer` | `string`
  * `widget`\
    ➤ UI widget used for editing parameter values
    * oneOf
      * *object*
        * `properties`
          * `settings` *(object)*
          * `type` *(string)*
        * required properties: `type`
      * *string*
* **required properties:** `name`

### `pipeline-param`

* **type:** *object*
* **properties:**
  * `category` *(string)*\
    ➤ Free-form category for the parameter. Shown in the user interface.
  * `default`\
    ➤ The default value for the pipeline parameter.
  * `description` *(string)*\
    ➤ Describes the parameter. This is shown as a help text in the user interface.
  * `name` *(string)*
  * `target` *(string)*
  * `targets`
    * oneOf
      * *string*
      * *array*
        * `items` *(string)*

### `task-node`

* **type:** *object*
* **properties:**
  * `actions` *(array)*
    * array items
      * [`node-action`](#node-action)
  * `commit` *(string)*
  * `edge-merge-mode`
    * [`node-edge-merge-mode`](#node-edge-merge-mode)
  * `name` *(string)*
  * `on-error` *(string)*
    * allowed values: `continue` | `retry` | `stop-all` | `stop-next`
  * `override`
    * [`overridden-properties`](#overridden-properties)
  * `step` *(string)*
  * `type` *(string)*
    * const: `task`
  * `task` *(string)*
* **required properties:** `name`, `type`

### `toleration`

* **type:** *object*
* **properties:**
  * `effect` *(string)*\
    ➤ Optional, empty tolerates all effects.
    * allowed values: `NoExecute` | `NoSchedule` | `PreferNoSchedule`
  * `key` *(string)*\
    ➤ Optional if operator is "Exists", to match all keys.
  * `operator` *(string)*\
    ➤ Optional, empty means "Equal".
    * allowed values: `Equal` | `Exists`
  * `value` *(string)*\
    ➤ Optional as empty in some contexts like when operator is "Exists".

### `variant-param`

* **type:** *object*
* **properties:**
  * `name` *(string)*
  * `rules`
    * [`variant-param-rule-item`](#variant-param-rule-item)
  * `style` *(string)*
    * allowed values: `linear` | `logspace` | `multiple` | `random` | `single`
* **required properties:** `name`, `rules`, `style`

### `variant-param-rule-item`

* **type:** *object*
* **properties:**
  * `base` *(number)*
  * `count` *(number)*
  * `distribution` *(string)*
  * `end` *(number)*
  * `integerify` *(boolean)*
  * `items` *(array)*
    * array items *(number)*
  * `max` *(number)*
  * `min` *(number)*
  * `numberify` *(boolean)*
  * `seed` *(number)*
  * `start` *(number)*
  * `step` *(number)*
  * `value` *(number)*

### `workflow-resource-cpu`

CPU resource requirements and limits in vCPU counts e.g. 0.1 is 10% of one CPU.

* **type:** *object*
* **properties:**
  * `max` *(number)*
  * `min` *(number)*

### `workflow-resource-devices`

Devices required e.g. 'nvidia.com/gpu: 1' for one NVIDIA GPU.

* **type:** *object*

### `workflow-resource-memory`

Memory requirements and limits in MB e.g. 100 is 100 MB.

* **type:** *object*
* **properties:**
  * `max` *(number)*
  * `min` *(number)*

## Related Documentation

* [valohai.yaml Overview](https://github.com/valohai/dokuhai/blob/main/docs/valohai.yaml-overview) - Introduction and basic examples
* [Using Parameters](https://docs.valohai.com/getting-started/intro/parameters) - Tutorial on working with parameters
* [How-to: Manage YAML](https://docs.valohai.com/valohai.yaml-overview/multiple-files) - Practical guides for YAML management
