Start by creating an account at app.valohai.com
Is your team already on Valohai?
Most Valohai users log in at app.valohai.com, but some teams have a self-hosted Valohai installation. Check with your team to find the right domain for you.
Install the tools
Install the valohai-cli tool by running the following on your local machine, or wherever you write your scripts:
pipx install valohai-cli
# Then use your credentials to log in
vh login
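If your team runs a self-hosted Valohai installation, you can point the CLI at your own domain. A minimal sketch, assuming the --host option of vh login and a placeholder URL:
vh login --host https://valohai.example.com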
pipx not found?
pipx is a utility to install and run Python applications in isolated environments. https://pypa.github.io/pipx/installation/
You can also install valohai-cli with pip or pip3, depending on your environment.
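For example, with plain pip:
pip install valohai-cli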
Create a project
Create a directory on your local machine
mkdir hello-valohai
Navigate into the directory, then create a new Valohai project and link it to this directory, as shown below. This ensures Valohai knows which project to use when you run a job from your machine.
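Change into the directory first:
cd hello-valohai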
vh project create --name hello-valohai
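If the project already exists in your Valohai organization, you can link your local directory to it instead. A sketch, assuming the vh project link command of valohai-cli:
vh project link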
Create a train.py
About this example
We won’t use highly complex examples in our quickstart. Our goal is to demonstrate how to launch jobs on remote cloud or on-premises machines using Valohai while ensuring versioning and tracking remain functional.
This quickstart is based on the YOLOv8 quickstart.
First, create a file called train.py and paste in a simple YOLOv8 example:
from ultralytics import YOLO

# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")

# Train the model for one epoch on the COCO128 sample dataset
model.train(data="coco128.yaml", epochs=1, verbose=False)

# Export the trained model to ONNX format
path = model.export(format="onnx")
Next, let’s edit the code to copy the generated model to Valohai outputs and create an alias, so we can easily reference the file later as “latest-model”.
import json
import os
import shutil

from ultralytics import YOLO

# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")

# Train the model for one epoch on the COCO128 sample dataset
model.train(data="coco128.yaml", epochs=1, verbose=False)

# Export the trained model to ONNX format
path = model.export(format="onnx")

# Copy the exported model to the Valohai outputs directory
shutil.copy(path, "/valohai/outputs/")

file_metadata = {
    "valohai.alias": "latest-model"
}

# Attach the metadata to our file as a sidecar JSON,
# so we can easily find this file with the alias defined
model_name = os.path.basename(path)  # e.g. best.onnx
with open(f"/valohai/outputs/{model_name}.metadata.json", "w") as outfile:
    json.dump(file_metadata, outfile)
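The same sidecar file can carry other properties besides the alias. For example, the snippet below assumes a valohai.tags property for tagging the output; treat the property name as an assumption and verify it against your Valohai documentation:

file_metadata = {
    "valohai.alias": "latest-model",
    "valohai.tags": ["onnx", "yolov8"],  # assumed property name for tagging outputs
}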
Create a valohai.yaml
Finally, let’s create a valohai.yaml configuration file to describe our job to Valohai. This configuration includes a step name, the command that runs our script, the environment (machine type) to run on, and a base image containing the Python and YOLOv8 libraries, sourced from a Docker image on hub.docker.com.
Update the environment with the GPU machine you want to run the job on. You can see a list of available GPU machines by running vh environments --gpu. Use the slug name in the YAML.
- step:
    name: yolo
    image: docker.io/ultralytics/ultralytics:8.0.180-python
    command: python train.py
    environment: aws-eu-west-1-p3-2xlarge
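Later, a follow-up step could consume the aliased model as an input through its datum URL, so it always picks up the newest version. A minimal sketch, assuming a hypothetical predict step and predict.py script:

- step:
    name: predict
    image: docker.io/ultralytics/ultralytics:8.0.180-python
    command: python predict.py
    inputs:
      - name: model
        default: datum://latest-model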
Run from the command-line
Finally, execute the training on a remote cloud or on-premises machine by sending the job to the Valohai scheduler. Run the following command on your local command-line:
vh execution run yolo --adhoc --open-browser
A new browser tab will open where you’ll find:
- Valohai execution details, including a snapshot of the code used to run the job.
- All standard logs.
- Upon job completion, you’ll find the trained model in the outputs tab. It will be versioned and uploaded to the designated data store.
Stream logs to your command-line
You can include the --watch flag to stream the logs back to your command-line:
vh execution run yolo --adhoc --watch