Pipelines automate your machine learning operations on the Valohai platform.

See also

You can read more about the reasoning behind general pipeline concepts like graphs, nodes and edges on the Pipelines core concepts page.

A pipeline definition has three required properties:

  • name: the name of the pipeline
  • nodes: a list of all nodes (executions) in the pipeline
  • edges: a list of all edges (data dependencies) between the nodes

A simple pipeline could look something like this:

# define "generate-dataset" and "train-model" steps above...
- pipeline:
    name: simple-pipeline
    nodes:
      - name: generate
        type: execution
        step: generate-dataset
      - name: train
        type: execution
        step: train-model
    edges:
      - [generate.output.images*, train.input.dataset-images]
      - [generate.output.labels*, train.input.dataset-labels]
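
The comment above references two steps that must be defined elsewhere in the same valohai.yaml. A minimal sketch of what they could look like is below; the image names and commands are illustrative assumptions, while the step names and the train step's input names match what the pipeline's nodes and edges reference:

```yaml
- step:
    name: generate-dataset
    image: python:3.9          # assumed image
    command: python generate.py  # assumed command

- step:
    name: train-model
    image: python:3.9          # assumed image
    command: python train.py     # assumed command
    inputs:
      - name: dataset-images
      - name: dataset-labels
```

The edge notation `generate.output.images*` then connects any output of the generate node whose name starts with `images` to the train node's `dataset-images` input.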

Here we have a pipeline with two nodes; the second node, train, waits for its inputs to be produced by the generate node. All files in /valohai/outputs whose names start with either images or labels are passed from generate to train.
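To produce outputs that match these edges, the generate node's code simply writes files into the outputs directory. Below is a minimal sketch of what generate-dataset could do; the VH_OUTPUTS_DIR environment variable and the local fallback path are assumptions for illustration, and the file names and contents are placeholders:

```python
import os

# On Valohai, outputs go to /valohai/outputs; we assume the VH_OUTPUTS_DIR
# environment variable points there, and fall back to a local directory
# so the script can also be run outside the platform.
outputs_dir = os.environ.get("VH_OUTPUTS_DIR", "outputs")
os.makedirs(outputs_dir, exist_ok=True)

# File names starting with "images" or "labels" match the
# generate.output.images* / generate.output.labels* edges above.
for name in ("images-train.npy", "labels-train.npy"):
    with open(os.path.join(outputs_dir, name), "wb") as f:
        f.write(b"\x00")  # placeholder payload
```

Anything the generate execution saves under these wildcard patterns is uploaded as an output and then downloaded into the train execution's matching inputs.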