# Define Your Job Types

Migrating to Valohai takes little more than a YAML file. You can run your existing code as-is: no rewrites, no vendor lock-in.

This guide shows you how to define your ML jobs and run them on Valohai's infrastructure while keeping your code portable.

> 💡 **Already using MLflow, W&B, or other tools?** Keep using them. Valohai runs your code as-is.

## Step 0: Keep Your Code Portable

The best code for Valohai is code that runs anywhere. Before defining your jobs:

* **Remove vendor-specific decorators** (they'll still work, but why lock yourself in?)
* **Use standard Python/R code** that runs locally
* **Keep dependencies explicit** in `requirements.txt` or similar

Your code stays yours — portable in and out of Valohai.
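As an illustration of what "portable" can look like, here is a minimal sketch of a training script that relies only on the Python standard library and takes its settings from the command line. The flag names (`--epochs`, `--learning-rate`) and the placeholder `train` function are hypothetical, not something Valohai requires.

```python
import argparse


def parse_args(argv=None):
    # Hypothetical flags; use whatever settings your own script needs.
    parser = argparse.ArgumentParser(description="Portable training script sketch")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    return parser.parse_args(argv)


def train(epochs, learning_rate):
    # Placeholder loop standing in for real model code:
    # the "loss" simply shrinks by a fixed factor each epoch.
    loss = 1.0
    for _ in range(epochs):
        loss *= 1.0 - learning_rate
    return loss


def main(argv=None):
    args = parse_args(argv)
    final_loss = train(args.epochs, args.learning_rate)
    print(f"final_loss={final_loss:.4f}")
    return final_loss


if __name__ == "__main__":
    main()
```

Because nothing here imports a platform SDK, the same file runs from a laptop terminal, a CI job, or a container without changes.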

> 💡 Want to use a language other than Python or R? No problem! Valohai jobs run inside Docker containers, so you only need to specify a suitable image in your `valohai.yaml`.

## Step 1: Create valohai.yaml

Add a `valohai.yaml` file to your project root. This file tells Valohai which job types exist in your project.

A minimal example:

```yaml
- step:
    name: train-model
    image: docker.io/python:3.10
    command:
        - pip install -r requirements.txt
        - python train.py
```

That's it. Your existing `train.py` runs unchanged.

### What's in a Step?

* **name**: How you'll reference this job type (e.g., `preprocessing`, `training`, `evaluation`)
* **image**: A Docker image with your base dependencies (Python, TensorFlow, etc.)
* **command**: Exactly what you'd run locally

> 💡 **No Docker experience?** Start with official images like `python:3.10` or `tensorflow/tensorflow:2.6.0` from [Docker Hub](https://hub.docker.com/).
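To make the three fields above concrete, the sketch below checks that every step carries a `name`, an `image`, and a `command`. It assumes the YAML has already been loaded into plain Python lists and dicts (for example with a YAML parser of your choice). The helper `missing_step_fields` is hypothetical and not part of any Valohai tooling.

```python
REQUIRED_STEP_FIELDS = ("name", "image", "command")


def missing_step_fields(config):
    """Return {step label: [missing fields]} for steps lacking a required field."""
    problems = {}
    for index, entry in enumerate(config):
        step = entry.get("step")
        if step is None:
            continue  # this sketch only looks at `- step:` entries
        missing = [field for field in REQUIRED_STEP_FIELDS if not step.get(field)]
        if missing:
            label = step.get("name") or f"<step #{index}>"
            problems[label] = missing
    return problems


# Hand-written stand-in for a parsed valohai.yaml; the second step is
# deliberately missing its image.
config = [
    {"step": {"name": "train-model",
              "image": "docker.io/python:3.10",
              "command": ["pip install -r requirements.txt", "python train.py"]}},
    {"step": {"name": "evaluate",
              "command": ["python evaluate.py"]}},
]

print(missing_step_fields(config))  # {'evaluate': ['image']}
```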

## Step 2: Run Your First Execution

An "execution" is Valohai's term for a single run of a job.

### Quick Test with Local Code

```shell
# Install Valohai CLI
pip install valohai-cli

# Login and create project
vh login
vh project create

# Run your job
vh execution run train-model --adhoc
```

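The `--adhoc` flag packages the code in your local working directory and sends it with the run, so you can test changes before committing them. The job runs on Valohai's infrastructure using the same commands you would run locally.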

### Production Runs from Git

Once you're happy, push to Git and run from there:

```shell
git add valohai.yaml
git commit -m "Add Valohai job definitions"
git push

# Fetch and run
vh project fetch
vh execution run train-model
```

## Common Patterns

### Multiple Job Types

Define all your workflow steps:

```yaml
- step:
    name: preprocess
    image: docker.io/python:3.10
    command:
        - python preprocess_data.py

- step:
    name: train
    image: tensorflow/tensorflow:2.6.0
    command:
        - python train_model.py

- step:
    name: evaluate
    image: tensorflow/tensorflow:2.6.0
    command:
        - python evaluate.py
```
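Since each `command` entry is just a shell line, one way to gain confidence before submitting is a local dry run that executes the lines of each step in order and stops at the first failure. The following is a rough sketch under that assumption. The helper names and the stand-in commands are hypothetical, and the real script names from the YAML above would replace them.

```python
import subprocess
import sys


def run_step_commands(commands):
    """Run each command line through the shell; stop at the first failure."""
    for line in commands:
        result = subprocess.run(line, shell=True)
        if result.returncode != 0:
            return False
    return True


def echo(message):
    # Stand-in for a real script call such as `python preprocess_data.py`.
    return f'"{sys.executable}" -c "print({message!r})"'


# Hypothetical mirror of the three steps defined in the YAML above.
steps = {
    "preprocess": [echo("preprocess ok")],
    "train": [echo("train ok")],
    "evaluate": [echo("evaluate ok")],
}

results = {name: run_step_commands(commands) for name, commands in steps.items()}
print(results)
```

This only checks that the commands succeed on your machine. It does not reproduce the Docker image each step would run in.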

### Using Private Registries

Have custom Docker images? After [connecting a private Docker registry to Valohai](/docker-in-valohai/private-docker-registries.md), you can reference its images in your steps:

```yaml
- step:
    name: train
    image: myregistry.azurecr.io/ml-base:latest
    command:
        - python train.py
```

### Non-pip Dependencies

Valohai doesn't restrict what you can run inside your jobs. Besides `pip install`, you can install packages with other tools, such as `conda` or `apt-get install -y`.

```yaml
- step:
    name: train
    image: continuumio/miniconda3
    command:
        - conda install pytorch -c pytorch -y
        - python train.py
```

## What About My Existing Tools?

Keep using them. Valohai runs your code as-is:

* **MLflow tracking?** Works
* **Weights & Biases?** Works
* **TensorBoard?** Works
* **Custom logging?** Works

You can migrate gradually — or not at all. Your choice.
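For the custom logging case, nothing platform-specific is needed. The sketch below uses only Python's standard `logging` module and writes to standard output, on the assumption that the job's output is captured the way a terminal would show it. The logger name `train` and the message format are illustrative choices.

```python
import logging
import sys


def build_logger(name="train"):
    """Create a logger that writes INFO and above to standard output."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(levelname)s %(name)s: %(message)s"))
        logger.addHandler(handler)
    return logger


logger = build_logger()
logger.info("epoch=%d loss=%.3f", 1, 0.42)
```

The same code behaves identically on a laptop, which keeps the script portable in the sense described in Step 0.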

## Next Steps

1. **Try one job** — Start with your simplest script
2. **Add parameters** — Make jobs configurable (covered in the next guide)
3. **Handle data** — Connect to your data sources
4. **Build pipelines** — Chain jobs together

> 💡 **Want the full picture?** Check out [Valohai Academy](https://learn.valohai.academy/paths) for comprehensive tutorials.

## Quick Reference

### Minimal valohai.yaml

```yaml
- step:
    name: my-job
    image: docker.io/python:3.10
    command:
        - python my_script.py
```

### CLI Commands

```shell
vh login                        # One-time setup
vh project create               # New project
vh execution run my-job --adhoc # Run with local code
vh execution run my-job         # Run from Git
```

### Common Docker Images

* `python:3.10` — Standard Python
* `tensorflow/tensorflow:2.6.0` — TensorFlow CPU only
* `tensorflow/tensorflow:2.6.0-gpu` — TensorFlow with GPU
* `pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime` — PyTorch
* `rocker/r-base:4.3.0` — R language

***

**Bottom line:** If your code runs locally, it runs on Valohai. No rewrites needed.


