You can add conditions to your pipeline, for example to stop the pipeline if certain metrics fall below an expected range, or to pause it until a human has reviewed the results.
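As a sketch, such a condition can be attached to a pipeline node in `valohai.yaml` using node actions. The metadata key `accuracy` and the threshold below are hypothetical; check the current Valohai documentation for the exact action fields and supported values:

```yaml
- pipeline:
    name: conditional-example
    nodes:
      - name: train-model-node
        type: execution
        step: train-model
        actions:
          # Stop the whole pipeline if the node finishes
          # with an accuracy metric below the threshold.
          - when: node-complete
            if: metadata.accuracy < 0.8
            then: stop-pipeline
```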
This guide is just a short recap; we strongly recommend completing the Mastering Valohai learning path on Valohai Academy. It gives you an overview and a “cheatsheet” for migrating projects, but it won’t explain the concepts and all the options in detail.
A Valohai pipeline consists of:
- nodes that represent a single job inside the pipeline. Most commonly this is an execution, but it could also be a task or a real-time deployment.
- edges that represent how the output of one job connects to the input of another job, for example output file -> input file, or metadata value -> parameter value.
Before you can define a pipeline, you’ll need to define steps and make sure they work as expected.
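A quick way to verify a step works is to run it on its own before wiring it into a pipeline. A minimal sketch using the Valohai CLI, assuming a step named `preprocess` exists in your `valohai.yaml`:

```shell
# Run a single step from your local code (ad-hoc) to confirm
# it completes and produces the outputs you expect.
vh execution run preprocess --adhoc
```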
Write a valohai.yaml
A pipeline is defined in the valohai.yaml. Let’s assume we already have 3 working steps, and we want to connect them together into a pipeline:
- preprocess will output a new dataset (dataset://images/latest-train) that will be connected to the train-model node’s input.
- train-model will generate a new model file that will be used for batch-inference.
- batch-inference will output the inference results.
```yaml
- step:
    name: preprocess
    image: docker.io/python:3.10
    command:
      - pip install -r requirements.txt
      - python data-preprocess.py

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - python train_model.py
    inputs:
      - name: data
        default: dataset://images/latest-train

- step:
    name: batch-inference
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install pillow
      - python batch_inference.py
    inputs:
      - name: test-images
        default: dataset://images/latest-test
      - name: model
        default: datum://production-latest

- pipeline:
    name: train-inference-pipeline
    nodes:
      - name: preprocess-node
        type: execution
        step: preprocess
      - name: train-model-node
        type: execution
        step: train-model
      - name: batch-inference-node
        type: execution
        step: batch-inference
    edges:
      - [preprocess-node.outputs.*, train-model-node.inputs.data]
      - [train-model-node.outputs.*.pkl, batch-inference-node.inputs.model]
```
You can now run your pipeline from your local code (adhoc) with:
```shell
vh pipeline run train-inference-pipeline --adhoc
```
valohai-utils users can define pipelines using Python.
You can create a new file called example_pipeline.py:
```python
from valohai import Pipeline


def main(config) -> Pipeline:
    # Create a pipeline called "train-inference-pipeline".
    pipe = Pipeline(name="train-inference-pipeline", config=config)

    # Define the pipeline nodes.
    preprocess = pipe.execution("preprocess")
    train = pipe.execution("train-model")
    inference = pipe.execution("batch-inference")

    # Configure the pipeline, i.e. define the edges.
    preprocess.output("*").to(train.input("data"))
    train.output("*.pkl").to(inference.input("model"))

    return pipe
```
You can now generate the pipeline’s YAML definition with:
```shell
vh yaml pipeline example_pipeline.py
```