Collecting metrics with the Valohai metadata system is straightforward: it becomes active whenever your script prints a line of JSON to the log, and each key-value pair in that line is recorded as a metric (a minimal example follows the list below).
Metrics offer several advantages:
- They allow you to sort jobs by metric values in the Executions table.
- They enable you to select one or multiple jobs and compare their metrics through a graph.
- They support the definition of “early stopping” rules to halt a job once certain metric thresholds are met.
- They allow you to set conditions in your pipeline to advance to the next stage only when specific metric criteria are satisfied.
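For example, a minimal, framework-agnostic sketch is enough to register metrics for a job; the metric names and values here are purely illustrative:
import json

# Valohai parses each line of JSON printed to the log
# and records its key-value pairs as metrics.
print(json.dumps({"epoch": 1, "accuracy": 0.93, "loss": 0.18}))
The valohai-utils helper library also ships a logger utility that prints metrics in this single-line JSON format for you, if you prefer not to format the JSON by hand.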
Print final metrics as JSON
YOLOv8's trainer already provides its metrics as a dictionary, so you can simply print them as JSON.
Edit your train.py file and print the metrics as JSON:
import argparse
import json
import shutil

from ultralytics import YOLO


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=3)
    # Parse the value explicitly; a bare type=bool would treat any
    # non-empty string (even "False") as True.
    parser.add_argument('--verbose', type=lambda v: v.lower() == 'true', default=False)
    return parser.parse_args()


args = parse_args()

# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")

# Train the model
model.train(data="coco128.yaml", epochs=args.epochs, verbose=args.verbose)

# Export the model to ONNX format
path = model.export(format="onnx")

metadata = {}
# Loop through the metrics
for metric in model.metrics.results_dict:
    # Some metrics have a 'metrics/' prefix (e.g., metrics/precision)
    # We split it to isolate the actual metric name.
    metric_name = metric.split("metrics/")[-1]
    metric_value = model.metrics.results_dict[metric]
    metadata[metric_name] = metric_value

# Print the JSON dictionary to register the metrics and their values in Valohai
print(json.dumps(metadata))

# Copy the exported model to the Valohai outputs directory
shutil.copy(path, '/valohai/outputs/')

# Define a JSON dictionary containing a friendly name
# You can then reference this file with datum://latest-model
file_metadata = {
    "valohai.alias": "latest-model"
}

# Attach the metadata to the file (serialize the dictionary to a JSON string first)
with open("/valohai/outputs/best.onnx.metadata.json", "w") as f:
    f.write(json.dumps(file_metadata))
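With the alias in place, any later step can fetch the newest model through the datum://latest-model URL. As a sketch, a hypothetical follow-up step in your valohai.yaml could declare it as an input (the step name, image, and command here are placeholders, not part of this tutorial):
- step:
    name: inference
    image: ultralytics/ultralytics:latest
    command: python inference.py
    inputs:
      - name: model
        default: datum://latest-model
When the execution starts, the alias resolves to whichever file it currently points at, so the step always receives the most recently exported model.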
Run from the command line
You can now run your training script and observe the JSON metrics printed in the log. Once the job finishes, navigate to the Executions tab to view the metrics in a table format (marked as number 1 in the image below). They will be tracked for each job.
vh execution run yolo --adhoc --open-browser
Customize the table view
You can customize which columns are displayed in the table and set your preferred decimal precision using the controls located on the right side above the table (marked as number 2 in the image).
Print metrics after each epoch
In the example above, metrics are printed once the training finishes. However, you can also print and visualize metrics as the training proceeds.
Different frameworks have their own approaches for achieving this. For YOLOv8, it involves using callbacks. For examples with other frameworks and libraries, refer to the Metrics & Visualization documentation.
Edit your train.py file to include a custom callback method:
import argparse
import json
import shutil

from ultralytics import YOLO


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=2)
    # Parse the value explicitly; a bare type=bool would treat any
    # non-empty string (even "False") as True.
    parser.add_argument('--verbose', type=lambda v: v.lower() == 'true', default=False)
    return parser.parse_args()


args = parse_args()


def print_valohai_metrics(trainer):
    metadata = {
        "epoch": trainer.epoch,
    }
    # Loop through the metrics
    for metric in trainer.metrics:
        # Some metrics have a 'metrics/' prefix
        # Splitting it yields the actual metric name
        metric_name = metric.split("metrics/")[-1]
        metric_value = trainer.metrics[metric]
        metadata[metric_name] = metric_value
    print(json.dumps(metadata))


# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")

# Print metrics at the end of every training epoch
model.add_callback("on_train_epoch_end", print_valohai_metrics)

# Train the model
model.train(data="coco128.yaml", epochs=args.epochs, verbose=args.verbose)

# Export the model to ONNX format
path = model.export(format="onnx")

metadata = {}
# Loop through the final metrics
for metric in model.metrics.results_dict:
    metric_name = metric.split("metrics/")[-1]
    metric_value = model.metrics.results_dict[metric]
    metadata[metric_name] = metric_value

# Valohai metrics are collected as JSON key:value pairs
print(json.dumps(metadata))

# Copy the exported model to the Valohai outputs directory
shutil.copy(path, '/valohai/outputs/')

# Define a JSON dictionary containing a friendly name
# You can reference this file with datum://latest-model
file_metadata = {
    "valohai.alias": "latest-model"
}

# Attach the metadata to the file
with open("/valohai/outputs/best.onnx.metadata.json", "w") as f:
    f.write(json.dumps(file_metadata))
Run your training script again with additional epochs, and notice that JSON metrics are now printed in the log after each epoch.
As soon as the “Metadata” tab becomes available, open it. Select “epoch” for the horizontal axis and any metric for the vertical axis (the selectors are on the right) to visualize your training progress, both while the job runs and after it completes.
vh execution run yolo --epochs=10 --adhoc --open-browser
Compare jobs
On the Executions tab, you can select multiple executions by checking the boxes at the beginning of each row. After choosing the executions you want to compare, click “Compare” located above the table. This allows you to view and compare their metrics side-by-side.