Define Your Job Types

Migrating to Valohai is simpler than you think. You can run your existing code with just a YAML file — no rewrites, no vendor lock-in.

This guide shows you how to define your ML jobs and run them on Valohai's infrastructure while keeping your code portable.

💡 Already using MLflow, W&B, or other tools? Keep using them. Valohai runs your code as-is.

Step 0: Keep Your Code Portable

The best code for Valohai is code that runs anywhere. Before defining your jobs:

  • Remove vendor-specific decorators (they'll still work, but why lock yourself in?)

  • Use standard Python/R code that runs locally

  • Keep dependencies explicit in requirements.txt or similar

Your code stays yours — portable in and out of Valohai.
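For instance, a portable training script can rely on nothing but the standard library and command-line arguments. The sketch below is a hypothetical train.py (the decaying-loss "training loop" is a stand-in for real work), shown only to illustrate code that runs identically on a laptop and inside a Valohai container:

```python
# train.py -- a hypothetical, framework-free training script.
# No vendor-specific decorators, no special runtime: it runs the
# same way locally and inside a Valohai execution.
import argparse
import json


def train(learning_rate: float, epochs: int) -> dict:
    # Stand-in for a real training loop: pretend the loss
    # decays by a factor of (1 - learning_rate) each epoch.
    loss = 1.0
    for _ in range(epochs):
        loss *= 1.0 - learning_rate
    return {"final_loss": loss, "epochs": epochs}


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("--learning-rate", type=float, default=0.1)
    parser.add_argument("--epochs", type=int, default=5)
    args = parser.parse_args()
    metrics = train(args.learning_rate, args.epochs)
    # Plain JSON on stdout works everywhere; Valohai also collects
    # JSON lines printed to stdout as execution metadata.
    print(json.dumps(metrics))
```

Running python train.py --epochs 10 behaves the same on your machine and on Valohai's infrastructure; nothing here knows it's in a container.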

💡 Want to use a language other than Python or R? No problem. Valohai jobs run inside Docker containers, so you just need to provide a suitable image in your valohai.yaml.

Step 1: Create valohai.yaml

Add a valohai.yaml file to your project root. This tells Valohai what job types exist in your project.

A minimal example:

- step:
    name: train-model
    image: docker.io/python:3.10
    command:
        - pip install -r requirements.txt
        - python train.py

That's it. Your existing train.py runs unchanged.

What's in a Step?

  • name: How you'll reference this job type (e.g., preprocessing, training, evaluation)

  • image: A Docker image with your base dependencies (Python, TensorFlow, etc.)

  • command: Exactly what you'd run locally

💡 No Docker experience? Start with official images like python:3.10 or tensorflow/tensorflow:2.6.0 from Docker Hub.

Step 2: Run Your First Execution

An "execution" is just Valohai's term for running your job once.

Quick Test with Local Code

# Install Valohai CLI
pip install valohai-cli

# Login and create project
vh login
vh project create

# Run your job
vh execution run train-model --adhoc

Your code runs on Valohai's infrastructure, but behaves exactly like it does locally.

Production Runs from Git

Once you're happy, push to Git and run from there:

git add valohai.yaml
git commit -m "Add Valohai job definitions"
git push

# Fetch and run
vh project fetch
vh execution run train-model

Common Patterns

Multiple Job Types

Define all your workflow steps:

- step:
    name: preprocess
    image: docker.io/python:3.10
    command:
        - python preprocess_data.py

- step:
    name: train
    image: tensorflow/tensorflow:2.6.0
    command:
        - python train_model.py

- step:
    name: evaluate
    image: tensorflow/tensorflow:2.6.0
    command:
        - python evaluate.py

Using Private Registries

Have custom Docker images? After connecting a private Docker registry to Valohai, you can use images from there in your executions:

- step:
    name: train
    image: myregistry.azurecr.io/ml-base:latest
    command:
        - python train.py
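
If you build and host such an image yourself, the usual Docker workflow applies. A minimal Dockerfile sketch (the registry URL matches the hypothetical example above):

```dockerfile
# Bake your base dependencies into the image so executions start faster.
FROM docker.io/python:3.10
COPY requirements.txt .
RUN pip install -r requirements.txt
```

Build it with docker build, push it to your registry with docker push, and reference the resulting tag in the step's image field.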

Non-pip Dependencies

Valohai doesn't restrict what you can run inside your jobs. Instead of pip install, you can install packages with, for example, conda or apt-get install -y:

- step:
    name: train
    image: continuumio/miniconda3
    command:
        - conda install pytorch -c pytorch -y
        - python train.py
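
System packages follow the same pattern with apt-get. A sketch assuming a Debian-based image (the official python images are), with libsndfile1 as a placeholder package:

```yaml
- step:
    name: train
    image: docker.io/python:3.10
    command:
        - apt-get update && apt-get install -y --no-install-recommends libsndfile1
        - pip install -r requirements.txt
        - python train.py
```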

What About My Existing Tools?

Keep using them. Valohai runs your code as-is:

  • MLflow tracking? Works

  • Weights & Biases? Works

  • TensorBoard? Works

  • Custom logging? Works

You can migrate gradually — or not at all. Your choice.
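As a concrete illustration of "custom logging works", the sketch below uses only the standard library; report is a hypothetical helper, not part of any Valohai SDK. It behaves identically on a laptop and inside an execution, because an execution is just your process writing to stdout:

```python
import json
import logging
import sys

# Your existing logging setup works unchanged: stdout from the
# container shows up in the execution's logs.
logging.basicConfig(stream=sys.stdout, level=logging.INFO,
                    format="%(name)s: %(message)s")
log = logging.getLogger("train")


def report(step: int, **metrics) -> str:
    """Log metrics with stdlib logging, and also print them as a JSON
    line -- a format Valohai collects from stdout as execution metadata."""
    log.info("step %d: %s", step, metrics)
    line = json.dumps({"step": step, **metrics})
    print(line)
    return line
```

Calling report(1, loss=0.5) emits the same log line and JSON line whether the script runs locally or on Valohai.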

Next Steps

  1. Try one job — Start with your simplest script

  2. Add parameters — Make jobs configurable (covered in the next guide)

  3. Handle data — Connect to your data sources

  4. Build pipelines — Chain jobs together

💡 Want the full picture? Check out Valohai Academy for comprehensive tutorials.

Quick Reference

Minimal valohai.yaml

- step:
    name: my-job
    image: docker.io/python:3.10
    command:
        - python my_script.py

CLI Commands

vh login                         # One-time setup
vh project create                # New project
vh execution run my-job --adhoc  # Run with local code
vh execution run my-job          # Run from Git

Common Docker Images

  • python:3.10 — Standard Python

  • tensorflow/tensorflow:2.6.0 — TensorFlow (CPU only)

  • tensorflow/tensorflow:2.6.0-gpu — TensorFlow with GPU

  • pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime — PyTorch

  • rocker/r-base:4.3.0 — R language


Bottom line: If your code runs locally, it runs on Valohai. No rewrites needed.
