Each node of a pipeline can run in a different type of environment, such as a different cloud instance type or an on-premises machine.
This flexibility allows you to tailor each step of your pipeline according to specific computational needs or data locality requirements. Below are the methods you can use to customize environments for your pipeline steps.
Using the web app
- Create a new pipeline.
- Click on the specific node in the pipeline graph.
- Choose the desired environment from the dropdown menu in the “Runtime” section.
Define Environments in valohai.yaml for each step
- In your valohai.yaml, use the environment property in the step definition.
- The value should be the environment slug, which you can obtain by running vh environments in your CLI.
- If this property is not defined, the default environment set for your organization or project will be used.
- step:
    name: preprocess-dataset
    image: python:3.9
    environment: aws-eu-west-1-g3s-xlarge
    command:
      - pip install numpy valohai-utils
      - python ./preprocess_dataset.py
    inputs:
      - name: dataset
        default: https://valohaidemo.blob.core.windows.net/mnist/mnist.npz
In this approach, you define the desired environment explicitly for each step in your pipeline. For example, you can specify that some steps should run on-premises, while others should run on AWS. By setting the environment at the step level, you have fine-grained control over which steps run where.
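As a rough sketch of the on-premises and AWS split described above, the fragment below pins two steps to different environments. The slug my-onprem-worker and the script train_model.py are hypothetical placeholders, not names from your project; replace the slug with one listed by vh environments. The AWS slug is reused from the earlier example.

```yaml
# Sketch only: two steps pinned to different environments.
- step:
    name: preprocess-dataset
    image: python:3.9
    # Hypothetical slug for an on-premises machine; use a real slug from `vh environments`.
    environment: my-onprem-worker
    command:
      - pip install numpy valohai-utils
      - python ./preprocess_dataset.py

- step:
    name: train-model
    image: python:3.9
    # AWS slug taken from the earlier example.
    environment: aws-eu-west-1-g3s-xlarge
    command:
      # Hypothetical script name, shown for illustration.
      - python ./train_model.py
```

With a layout like this, every execution of preprocess-dataset is scheduled on the on-premises environment and every execution of train-model on the AWS instance, whether or not the steps run as part of a pipeline.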
Override Environment in valohai.yaml
In some cases, you may want a step to use a different environment when its execution or Task runs inside a pipeline. You can do this with pipeline overrides in valohai.yaml.
- pipeline:
    name: Training Pipeline
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
        override:
          environment: aws-eu-west-1-g3s-xlarge
      - name: train
        type: execution
        step: train-model
      - name: evaluate
        type: execution
        step: batch-inference
    edges:
      - [preprocess.output.preprocessed_mnist.npz, train.input.dataset]
      - [train.output.model*, evaluate.input.model]
By default, each node in a pipeline runs in the environment defined for its step, or in the organization or project default if the step defines none. An override lets a specific node use a different environment without changing the step definition.
This can be useful when you want most of your steps to run in a particular environment but need exceptions for certain steps.
By combining these approaches, you can run different steps of a pipeline in different environments and tailor each step to your specific requirements.