pipeline
Pipelines automate your machine learning operations on the Valohai ecosystem.
See also
You can read more about the reasoning behind general pipeline concepts like graphs, nodes and edges on the Pipelines core concepts page.
A pipeline definition has 3 required properties:
- name: name for the pipeline
- nodes: list of all nodes (executions and deployments) in the pipeline
- edges: list of all edges (requirements) between the nodes
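As a minimal sketch, the three required properties fit together as shown below. The pipeline, node, and step names (example-pipeline, first-node, second-node, first-step, second-step), the data* file pattern, and the dataset input name are hypothetical placeholders. The referenced steps are assumed to be defined elsewhere in the same file, with second-step declaring an input named dataset.

```yaml
- pipeline:
    # name: identifies the pipeline
    name: example-pipeline
    # nodes: everything the pipeline runs (executions and deployments)
    nodes:
      - name: first-node
        type: execution
        step: first-step
      - name: second-node
        type: execution
        step: second-step
    # edges: requirements between nodes, written as [source, target] pairs
    edges:
      - [first-node.output.data*, second-node.input.dataset]
```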
A simple pipeline could look something like this:
---

- step:
    name: generate-dataset
    image: python:3.6
    command: python preprocess.py

- step:
    name: train-model
    image: tensorflow/tensorflow:2.2.0-gpu
    command: python train.py
    inputs:
      - name: dataset-images
        default: http://...
      - name: dataset-labels
        default: http://...

- pipeline:
    name: simple-pipeline
    nodes:
      - name: generate-node
        type: execution
        step: generate-dataset
      - name: train-node
        type: execution
        step: train-model
      - name: deploy-node
        type: deployment
        deployment: mydeployment
        endpoints:
          - predict-digit
    edges:
      - [generate-node.output.images*, train-node.input.dataset-images]
      - [generate-node.output.labels*, train-node.input.dataset-labels]
      - [train-node.output.model*, deploy-node.file.predict-digit.model]

- endpoint:
    name: predict-digit
    description: predict digits from image inputs ("file" parameter)
    image: tensorflow/tensorflow:1.13.1-py3
    wsgi: predict_wsgi:predict_wsgi
    files:
      - name: model
        description: Model output file from TensorFlow
        path: model.pb
Here we have a pipeline with 3 nodes. The second node, train-node, waits for its inputs to be generated by generate-node. The third node, deploy-node, deploys the model output by train-node. All files in /valohai/outputs whose names start with either images or labels are passed between the executions.
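Each edge in the example is a two-item list, [source, target]. The annotated copy below spells out how the parts appear to be read, based only on the example above. Reading the trailing * as a file-name prefix wildcard follows from the description of images and labels; it is an assumption rather than a full specification of the edge syntax.

```yaml
edges:
  # source: <node name>.output.<file name pattern>
  # target: <node name>.input.<input name declared on the step>
  # Files written by generate-node whose names start with "images"
  # become the dataset-images input of train-node.
  - [generate-node.output.images*, train-node.input.dataset-images]
  # Same shape for files whose names start with "labels".
  - [generate-node.output.labels*, train-node.input.dataset-labels]
  # target for a deployment node: <node name>.file.<endpoint name>.<file name>
  # Files starting with "model" from train-node become the "model" file
  # of the predict-digit endpoint.
  - [train-node.output.model*, deploy-node.file.predict-digit.model]
```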
Override default inputs
In the above example:
- The train-model step has two inputs, each with its own default value.
- The pipeline defines that the train-model node should use the outputs of generate-dataset as its inputs.
By default, Valohai includes both the files from the default input location and the files generated by the pipeline as the step's inputs. If you want the input from the pipeline to replace the default input instead, specify an override in the pipeline.
- name: train
  type: execution
  step: train-model
  override:
    inputs:
      - name: dataset-images
      - name: dataset-labels
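For context, the sketch below places such an override inside a complete pipeline definition, reusing the node names and edges from the simple-pipeline example and leaving out the deployment node for brevity. The pipeline name is a hypothetical placeholder. It assumes the override block sits directly on the node entry, as the fragment above suggests, and that listing an input by name without a default is what drops the step's default for that input.

```yaml
- pipeline:
    name: simple-pipeline-with-override
    nodes:
      - name: generate-node
        type: execution
        step: generate-dataset
      - name: train-node
        type: execution
        step: train-model
        # Inputs listed here without a default drop the defaults declared
        # on the train-model step, so only the edge-provided files remain.
        override:
          inputs:
            - name: dataset-images
            - name: dataset-labels
    edges:
      - [generate-node.output.images*, train-node.input.dataset-images]
      - [generate-node.output.labels*, train-node.input.dataset-labels]
```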
See also