# Defining Jobs

> 💡 **About this tutorial:** We use YOLOv8 as a practical example to demonstrate Valohai's features. You don't need computer vision knowledge—the patterns you learn here apply to any ML framework. This tutorial focuses on defining jobs (=executions) and saving their output files while ensuring proper versioning and tracking of your ML workflows.

### Prerequisites

* Python 3.8 or later
* A Valohai account ([create one free](https://app.valohai.com))

### Install the CLI

Install the Valohai CLI, which you'll use to create projects and run executions:

```shell
pip install valohai-cli
```

> **Tip:** Use `pipx install valohai-cli` to avoid dependency conflicts.

### Login

```shell
vh login
```

<details>

<summary><strong>Using SSO or self-hosted Valohai?</strong></summary>

**SSO Users (Azure, Google, SAML)**

Generate a personal access token:

1. Go to **Hi, \[username]** → **My Profile** → **Authentication**
2. Click **Manage Tokens** → **Create a new personal token**
3. Save the token securely (shown only once)

```shell
vh login --token <your-token>
```

**Self-Hosted Installations**

```shell
vh login --host https://your-company.valohai.io
```

> 💡 Combine options: `vh login --host https://your-company.valohai.io --token <token>`

</details>

### Create Your First Project

Set up a project directory and connect it to Valohai:

```shell
mkdir my-ml-project
cd my-ml-project
vh project create --name my-ml-project
```

This links your local directory to Valohai for experiment tracking.

### Write Your Training Script

Save this as `train.py`. The first half is standard YOLO model training; the second half is Valohai-specific. A few things to note about the Valohai part:

* The script copies the exported model file to the `/valohai/outputs/` directory, from where Valohai versions and uploads it.
* YOLO training and export produce several files, including one called `best.onnx`.
* We want a Valohai alias called `latest-model` that always points to the newest file generated by this job. We'll use it for inference later.
* To set that up, the script writes a JSON file called `best.onnx.metadata.json` that tells Valohai to point the `latest-model` alias at the newly generated file. Read more about aliases in our alias docs.

```python
import shutil
from ultralytics import YOLO
import json

# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")

# Train the model
model.train(data="coco128.yaml", epochs=1, verbose=False)

# Export the model to ONNX format
path = model.export(format="onnx")

# Valohai parts start here

# Copy the exported model to the Valohai outputs directory
shutil.copy(path, "/valohai/outputs/")

# Define a JSON dictionary containing a friendly name
# You can then reference this file with datum://latest-model
file_metadata = {
    "valohai.alias": "latest-model",
}

# Attach the metadata to the file by writing it as a JSON sidecar file
with open("/valohai/outputs/best.onnx.metadata.json", "w") as f:
    json.dump(file_metadata, f)
```
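
The two Valohai-specific steps in the script (copy the file, then write a `<file name>.metadata.json` sidecar next to it) follow the same pattern for any output file. Below is a minimal sketch of a reusable helper. The function name `save_output_with_alias` and its `outputs_dir` parameter are hypothetical and introduced only for illustration; they are not part of any Valohai library.

```python
import json
import shutil
from pathlib import Path


def save_output_with_alias(src, alias, outputs_dir="/valohai/outputs"):
    """Copy `src` into the outputs directory and write its metadata sidecar.

    Hypothetical helper that mirrors the Valohai-specific part of train.py.
    """
    outputs = Path(outputs_dir)
    outputs.mkdir(parents=True, exist_ok=True)

    # Copy the file so Valohai can version and upload it
    target = Path(shutil.copy(src, outputs))

    # The sidecar is named "<output file name>.metadata.json"
    sidecar = target.with_name(target.name + ".metadata.json")
    sidecar.write_text(json.dumps({"valohai.alias": alias}))
    return target, sidecar
```

In `train.py`, a call such as `save_output_with_alias(path, "latest-model")` could stand in for the copy and metadata lines. The `outputs_dir` parameter makes it possible to try the helper locally against a temporary directory.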

### Configure Your Execution Environment

Create `valohai.yaml` to define how your code runs.

Set the `environment` field to the GPU machine you want to run the job on. To list the available GPU machines, run `vh environments --gpu` and use the slug shown in the output.

```yaml
- step:
    name: yolo
    image: docker.io/ultralytics/ultralytics:8.0.180-python
    command: python train.py
    environment: aws-eu-west-1-p3-2xlarge
```

#### About Docker Images

The `image` field specifies your execution environment. Think of it as a clean environment with only the software you specify. For example:

* **Python**: `python:3.9`, `python:3.11`
* **TensorFlow**: `tensorflow/tensorflow:2.13.0`
* **PyTorch**: `pytorch/pytorch:2.0.0-cuda11.7-cudnn8-runtime`
* **R**: `r-base:4.3.0`
* **Custom**: Your own Docker images with specific dependencies

> **Note:** The Python version in the Docker image can differ from your local Python version. Valohai runs your code in this isolated environment for reproducibility.
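
Because the image decides which interpreter runs your code, it can help to log the runtime at the start of a script. This is a minimal sketch using only the standard library; the function name `describe_runtime` is hypothetical. The printed line would show up in the execution logs.

```python
import platform
import sys


def describe_runtime():
    """Return a one-line summary of the interpreter running this script."""
    return (
        f"Python {platform.python_version()} "
        f"on {platform.system()} ({sys.executable})"
    )


# Print once at startup so the version is visible in the logs
print(describe_runtime())
```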

### Run Your First Execution

Submit the job and watch the logs in real time:

```shell
vh execution run yolo --adhoc --watch
```

What happens:

* `--adhoc` uploads your current code without pushing to Git (great for testing)
* `--watch` streams logs to your terminal
* Valohai tracks all inputs, outputs, and parameters automatically

### View Results in the UI

Open your execution in the browser:

```shell
vh execution open
```

The UI shows:

* Real-time logs and metrics charts
* Parameter values and configuration
* Output files (downloadable)
* Full reproducibility information
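
A minimal sketch of how a script could feed the metrics charts, assuming Valohai collects execution metadata from single-line JSON objects printed to standard output (verify this against the metadata documentation for your installation). The helper names `format_metrics` and `log_metrics` and the metric values are hypothetical.

```python
import json


def format_metrics(**values):
    """Serialize metric values as a single-line JSON object."""
    return json.dumps(values)


def log_metrics(**values):
    """Print metrics to stdout, one JSON object per line."""
    print(format_metrics(**values), flush=True)


# Example values only; in a real script these come from the training loop
log_metrics(epoch=1, loss=0.42)
```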

***

#### Troubleshooting

<details>

<summary><strong>Command 'vh' not found</strong></summary>

The CLI isn't in your system PATH. Fix it:

**macOS/Linux:**

```bash
echo 'export PATH="$HOME/.local/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
```

**Windows:** Add the Python Scripts folder to your PATH environment variable.

**Alternative:** Run with `python -m valohai_cli` instead of `vh`.
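
To check where the problem lies, the following sketch uses only the Python standard library to report whether `vh` is on your PATH and where the current interpreter installs console scripts. The function names are hypothetical. Note that a `pip install --user` installation may place scripts in a different directory than the one printed here, such as `~/.local/bin`.

```python
import shutil
import sysconfig


def find_cli(name="vh"):
    """Return the full path to the command if it is on PATH, otherwise None."""
    return shutil.which(name)


def scripts_dir():
    """Return the directory where this interpreter installs console scripts."""
    return sysconfig.get_path("scripts")


location = find_cli()
if location:
    print(f"vh found at {location}")
else:
    print(f"vh is not on PATH; check whether it was installed into {scripts_dir()}")
```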

</details>

<details>

<summary><strong>Project not linked</strong></summary>

Run `vh project link` to connect your directory to an existing Valohai project.

</details>

<details>

<summary><strong>Import error for valohai module</strong></summary>

Install the Valohai utilities as a part of your step in `valohai.yaml`:

```shell
pip install valohai-utils
```

</details>


