Collecting metrics with the Valohai metadata system is straightforward: any single-line JSON object your script prints to standard output is collected as metrics.
Metrics have several benefits:
- Sort jobs by metric values in the Executions table.
- Select one or more jobs and compare them in a graph.
- Define “early stopping” rules to stop a job once specific metric thresholds are reached.
- Create pipeline conditions so a run progresses to the next stage only when certain metric conditions are satisfied.
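Under the hood, the mechanism is simply printing one JSON object per line to standard output; Valohai parses each such line into a metrics row. A minimal sketch (the metric names here are illustrative, not required):

```python
import json

# A metric is registered whenever your script prints a single-line JSON object.
# The keys ("accuracy", "loss") are illustrative; use any names you like.
metrics = {"accuracy": 0.92, "loss": 0.31}
print(json.dumps(metrics))
```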
Print final metrics as JSON
YOLOv8’s trainer already exposes its metrics as a dictionary, so we can simply print them out as JSON.
Edit your train.py to print the metrics in JSON:
import argparse
import json
import os
import shutil

from ultralytics import YOLO

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=3)
    # argparse's type=bool treats any non-empty string as True,
    # so use a flag instead
    parser.add_argument('--verbose', action='store_true')
    return parser.parse_args()

args = parse_args()

# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")

# Train the model
model.train(data="coco128.yaml", epochs=args.epochs, verbose=args.verbose)
path = model.export(format="onnx")  # export the model to ONNX format

metadata = {}
# Loop through the metrics
for metric in model.metrics.results_dict:
    # Some of the metrics are named with a metrics/ prefix,
    # for example metrics/precision; split it to get just "precision"
    metric_name = metric.split("metrics/")[-1]
    metadata[metric_name] = model.metrics.results_dict[metric]

# Print the JSON dictionary to register the metrics and their values with Valohai
print(json.dumps(metadata))

# Copy the exported model to the Valohai outputs directory
shutil.copy(path, '/valohai/outputs/')

# Define a JSON dictionary that contains a friendly name
# We can then point to this file with datum://latest-model
file_metadata = {
    "valohai.alias": "latest-model"
}

# Attach the metadata to our file; the sidecar must be named
# <output-filename>.metadata.json, so derive it from the exported file's name
with open(f"/valohai/outputs/{os.path.basename(path)}.metadata.json", "w") as f:
    f.write(json.dumps(file_metadata))
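The prefix handling in the loop above is just str.split: when the separator occurs, the last piece is the bare metric name, and when it does not, the key passes through unchanged (the key names below are illustrative):

```python
# Keys with the "metrics/" prefix lose it; other keys pass through as-is,
# because str.split on a missing separator returns the whole string.
assert "metrics/precision".split("metrics/")[-1] == "precision"
assert "fitness".split("metrics/")[-1] == "fitness"
```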
Run from the command line
Now you can execute your training script, and you’ll see the JSON metrics printed in the log. Once the job completes, you can navigate to the Executions tab to see the metrics in a table format (number 1 in the picture below). They will now be tracked for each job.
vh execution run yolo --adhoc --open-browser
Customize the table view
You can customize which columns are displayed in the table, and the decimal precision used, with the controls on the right side above the table (number 2 in the picture above).
Print metrics after each epoch
In the example provided, metrics are printed out after the training is finished. However, you can also print and visualize metrics while the training is ongoing.
Various frameworks have distinct methods for accomplishing this, and in the case of YOLOv8, it involves the use of callbacks. To explore examples with other frameworks and libraries, refer to the documentation for Metrics & Visualization.
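Regardless of the framework, the underlying mechanism is the same: print one JSON object per epoch, including the epoch number so it can serve as the horizontal axis. A framework-agnostic sketch with made-up metric values:

```python
import json

# One JSON line per epoch; including an "epoch" key lets you plot
# metrics against it in the Metadata tab. Loss values are made up.
for epoch in range(3):
    row = {"epoch": epoch, "loss": 1.0 / (epoch + 1)}
    print(json.dumps(row))
```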
Edit your train.py to include a custom callback method
import argparse
import json
import os
import shutil

from ultralytics import YOLO

def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--epochs', type=int, default=2)
    # argparse's type=bool treats any non-empty string as True,
    # so use a flag instead
    parser.add_argument('--verbose', action='store_true')
    return parser.parse_args()

args = parse_args()

def print_valohai_metrics(trainer):
    metadata = {
        "epoch": trainer.epoch,
    }
    # Loop through the metrics
    for metric in trainer.metrics:
        metric_name = metric.split("metrics/")[-1]
        metadata[metric_name] = trainer.metrics[metric]
    print(json.dumps(metadata))

# Load a pretrained model (recommended for training)
model = YOLO("yolov8n.pt")
model.add_callback("on_train_epoch_end", print_valohai_metrics)

# Train the model
model.train(data="coco128.yaml", epochs=args.epochs, verbose=args.verbose)
path = model.export(format="onnx")  # export the model to ONNX format

metadata = {}
# Loop through the metrics
for metric in model.metrics.results_dict:
    # Some of the metrics are named with a metrics/ prefix,
    # for example metrics/precision; split it to get just "precision"
    metric_name = metric.split("metrics/")[-1]
    metadata[metric_name] = model.metrics.results_dict[metric]

# Valohai metrics are collected as JSON key-value pairs
print(json.dumps(metadata))

# Copy the exported model to the Valohai outputs directory
shutil.copy(path, '/valohai/outputs/')

# Define a JSON dictionary that contains a friendly name
# We can then point to this file with datum://latest-model
file_metadata = {
    "valohai.alias": "latest-model"
}

# Attach the metadata to our file; the sidecar must be named
# <output-filename>.metadata.json, so derive it from the exported file's name
with open(f"/valohai/outputs/{os.path.basename(path)}.metadata.json", "w") as f:
    f.write(json.dumps(file_metadata))
Run your training script with a few more epochs, and you will see JSON metrics printed in the log after each epoch.
Once the “Metadata” tab becomes active, select it and choose epoch for the horizontal axis and any desired metric for the vertical axis to visualize your training progress, both while the job runs and after it completes.
vh execution run yolo --epochs=10 --adhoc --open-browser
Compare jobs
On the Executions tab, you can select multiple executions by clicking the checkbox at the beginning of each row. After selecting the desired executions, click on “Compare” located above the table to compare them.