You can easily migrate your existing projects to Valohai.
There are multiple ways to do this, but the one requiring the least work is to add a new file, valohai.yaml, and list your job types (= steps) in it.
This is just a short recap. We strongly recommend completing the Mastering Valohai learning path on Valohai Academy. This guide provides an overview and a “cheatsheet” for migrating projects; it won’t explain the concepts and all the options in detail.
Write a valohai.yaml
Migrating a project to Valohai usually starts by defining the steps in `valohai.yaml`. Each step defines one type of job in your project, for example `data-preprocessing`, `training`, `evaluation`, `predict`, etc.
In its simplest form, a Valohai step definition looks like this:
```yaml
- step:
    name: preprocess
    image: docker.io/python:3.10
    command:
      - pip install -r requirements.txt
      - python data-preprocess.py
```
- `name` should be unique within each `valohai.yaml` file. It’s how you identify different job types.
- `image` points to a published Docker image that has (most of) the packages needed to run your code, for example `numpy`, `matplotlib`, `sqlite3`, etc.
- `command` tells Valohai what to do when you say “Run preprocess”. In this case we’re running two commands: install additional packages, then run `data-preprocess.py`.
Where do those Docker images come from?
You can use public Docker images, for example from Docker Hub, where you’ll find the official images for Python, TensorFlow, PyTorch, Ultralytics YOLO, images for R, and many others.
You can also use private Docker registries, such as AWS ECR, Azure Container Registry, Google Cloud Artifact Registry, and any registry that supports username + password/token login.
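Assuming the registry has been connected to your Valohai configuration, a step can reference a private image by its full path. The registry host, repository, and tag below are hypothetical placeholders:

```yaml
- step:
    name: train-model
    # Hypothetical image in a private AWS ECR registry
    image: 123456789012.dkr.ecr.eu-west-1.amazonaws.com/my-team/trainer:1.2.0
    command:
      - python train_model.py
```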
Example valohai.yaml
How your `valohai.yaml` looks will depend on what your project is about. You should add one step at a time, run it in Valohai, and make sure it’s working as expected.

An example of a basic `valohai.yaml` file could look like this:
```yaml
- step:
    name: preprocess
    image: docker.io/python:3.10
    command:
      - pip install -r requirements.txt
      - python data-preprocess.py

- step:
    name: train-model
    image: tensorflow/tensorflow:2.6.0
    command:
      - python train_model.py

- step:
    name: batch-inference
    image: tensorflow/tensorflow:2.6.0
    command:
      - pip install pillow
      - python batch_inference.py
```
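If you want to sanity-check the file locally before running anything, a small hypothetical helper (not part of the Valohai tooling) can parse it with PyYAML and confirm each step has the required keys:

```python
# Hypothetical local sanity check for a valohai.yaml file (not a Valohai tool).
# Requires PyYAML: pip install pyyaml
import yaml

REQUIRED_KEYS = {"name", "image", "command"}

def check_steps(text: str) -> list:
    """Parse valohai.yaml text and return the step names, raising if a step is incomplete."""
    names = []
    for entry in yaml.safe_load(text):
        step = entry["step"]
        missing = REQUIRED_KEYS - step.keys()
        if missing:
            raise ValueError(f"step {step.get('name', '?')} is missing keys: {missing}")
        names.append(step["name"])
    return names

example = """
- step:
    name: preprocess
    image: docker.io/python:3.10
    command:
      - pip install -r requirements.txt
      - python data-preprocess.py
"""

print(check_steps(example))  # ['preprocess']
```

This only checks structure; the authoritative validation happens when Valohai parses the file.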
What if you don’t use pip?
Valohai doesn’t restrict what you can run inside your jobs. Instead of running `pip install`, you can also install packages using, for example, `conda` or `apt-get install -y`.
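As a sketch, a step that installs a system-level package with apt-get could look like the following. The step name, package, and script are hypothetical, and the commands assume the container runs as root (as the official Python images do):

```yaml
- step:
    name: preprocess-audio
    image: docker.io/python:3.10
    command:
      # Install a system-level dependency with apt-get (hypothetical package)
      - apt-get update && apt-get install -y ffmpeg
      - pip install -r requirements.txt
      - python preprocess_audio.py
```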
Running jobs
You can launch a Valohai Execution either from your local code or by pushing your changes to Git and fetching them to Valohai.
In either case, make sure you’ve installed valohai-cli
on your own machine and logged in.
```shell
pip install valohai-cli
# or alternatively
# pipx install valohai-cli

# login with your Valohai credentials
vh login

# create a new Valohai project
vh project create
# or alternatively link to an existing project
# vh project link
```
Run with local code
You can run a job from local code using the step name:
```shell
vh execution run preprocess --adhoc --watch
```
Run with code from Git
Authorize Valohai to your repository
Start by authenticating your Valohai project with your private Git repository. In your browser, go to your repository (GitHub, GitLab, Bitbucket, Azure DevOps, etc.) and add a new public key.
Now go to the Valohai web application and find the project you created or linked to.
- Open the Settings tab
- Go to the Repository tab
- Provide the URL and the Private Key
- Save changes
Run based on a commit
Start by pushing your changes to Git, like you normally would. For example:
```shell
git add valohai.yaml
git commit -m "Added a blueprint for Valohai jobs"
git push
```
Now from your local command-line you can run:
```shell
vh project fetch
vh execution run preprocess --open-browser
```
You could also do this from the web application:
- Open your project in the Valohai App
- Click on the “Fetch repository” button on the right side
- Click on “Create execution” and start creating a new execution