Collect Metrics from Output Files (YOLO & Others)
The Problem
runs/train/exp/
├── results.csv      # Training metrics per epoch
├── weights/
│   ├── best.pt      # Best model
│   └── last.pt      # Latest model
└── ...
Quick Example
valohai.yaml
- step:
    name: train-yolov8
    image: ultralytics/yolov8:latest
    command:
      - git clone https://github.com/ultralytics/yolov8.git
      - tar -xf /valohai/inputs/dataset/coco128.tar
      - pip install watchdog
      - nohup python ./scripts/valohai_watch.py & # Start watcher in background
      - python yolov8/train.py --data coco128.yaml --epochs {parameters}
    inputs:
      - name: dataset
        default: https://github.com/ultralytics/yolov8/releases/download/v1.0/coco128.tar.xz
    parameters:
      - name: epochs
        type: integer
        default: 10
    environment: aws-eu-west-1-g4dn-xlarge
scripts/valohai_watch.py
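A minimal sketch of what such a watcher could look like, assuming the training run writes results.csv somewhere under runs/ (as in the directory tree above) and that metrics are reported by printing one JSON object per line to stdout. The names read_latest_row, log_metadata, and watch are illustrative and not part of any library. watchdog is imported inside watch() so the parsing helpers can be reused without it.

```python
import csv
import json
import os


def read_latest_row(csv_path):
    """Return the last data row of a CSV file as {column: float}, or None."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))
    if not rows:
        return None
    latest = {}
    for key, value in rows[-1].items():
        # Skip ragged rows (missing or surplus fields).
        if key is None or value is None:
            continue
        try:
            # Headers and values may be space-padded, so strip the key;
            # float() already tolerates surrounding whitespace.
            latest[key.strip()] = float(value)
        except ValueError:
            continue  # Non-numeric column, ignore it.
    return latest


def log_metadata(metrics):
    """Print metrics as a single JSON line so the platform can collect them."""
    print(json.dumps(metrics), flush=True)


def watch(directory="runs", filename="results.csv"):
    """Watch `directory` recursively and log the newest row of `filename`."""
    # Imported here (assumption: installed via `pip install watchdog`).
    from watchdog.events import FileSystemEventHandler
    from watchdog.observers import Observer

    class ResultsHandler(FileSystemEventHandler):
        def on_modified(self, event):
            if event.is_directory:
                return
            if os.path.basename(event.src_path) != filename:
                return
            metrics = read_latest_row(event.src_path)
            if metrics:
                log_metadata(metrics)

    # The directory must exist before it can be watched.
    os.makedirs(directory, exist_ok=True)
    observer = Observer()
    observer.schedule(ResultsHandler(), directory, recursive=True)
    observer.start()
    observer.join()  # Block until the process is terminated.
```

To use this as the background script in the step above, call watch() at the end of the file.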
How It Works
Complete Working Example
scripts/valohai_watch.py (Complete)
Adapting for Other File Formats
JSON Files
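As a hedged sketch, a framework that writes its metrics to a JSON file could be handled by flattening nested objects into slash-separated keys and printing the numeric values as one JSON line. The helper names flatten and log_json_metrics are hypothetical, and the file layout is an assumption.

```python
import json


def flatten(obj, prefix=""):
    """Flatten nested dicts into {"a/b": value}, keeping numeric leaves only."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}/"))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = value
    return flat


def log_json_metrics(path):
    """Read a JSON metrics file and print its numeric values as one JSON line."""
    with open(path) as f:
        data = json.load(f)
    if not isinstance(data, dict):
        return {}  # This sketch only handles a top-level JSON object.
    metrics = flatten(data)
    if metrics:
        print(json.dumps(metrics), flush=True)
    return metrics
```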
Text Files with Key-Value Pairs
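For plain-text logs, a sketch along these lines could work, assuming each metric sits on its own line as "key: value" or "key = value". The function name parse_key_values and the accepted line format are assumptions; lines that do not match, or whose value is not numeric, are skipped.

```python
import re

# One "key: value" or "key = value" pair per line.
PAIR = re.compile(r"^\s*([^:=\s][^:=]*?)\s*[:=]\s*(\S+)\s*$")


def parse_key_values(text):
    """Return {key: float} for every line holding a numeric key-value pair."""
    metrics = {}
    for line in text.splitlines():
        match = PAIR.match(line)
        if not match:
            continue
        try:
            metrics[match.group(1)] = float(match.group(2))
        except ValueError:
            continue  # Value is not a number, skip the line.
    return metrics
```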
TensorBoard Event Files
Best Practices
Start Watcher Before Training
Use nohup for Background Execution
Handle Partial Writes
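One way to guard against reading a half-written line is to remember a byte offset and only consume data up to the last newline. The class below is a sketch under that assumption; TailReader is a hypothetical name, not an existing API.

```python
class TailReader:
    """Return only complete, previously unseen lines from a growing file."""

    def __init__(self, path):
        self.path = path
        self.offset = 0  # Byte position up to which lines were consumed.

    def read_new_lines(self):
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        end = chunk.rfind(b"\n")
        if end == -1:
            return []  # No complete line yet, wait for the next event.
        complete = chunk[: end + 1]
        self.offset += len(complete)
        return complete.decode("utf-8").splitlines()
```

Because the offset only advances past complete lines, a trailing fragment is picked up again once the writer finishes it, and lines that were already returned are not returned twice.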
Filter by Filename Pattern
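A small sketch of filename filtering with the standard-library fnmatch module, so that events for weights or other files are ignored. The helper name is_metrics_file and the default pattern are assumptions.

```python
import fnmatch
import os


def is_metrics_file(path, patterns=("results.csv",)):
    """Return True if the file name matches any of the glob patterns."""
    name = os.path.basename(path)
    return any(fnmatch.fnmatch(name, pattern) for pattern in patterns)
```

A watcher could call is_metrics_file(event.src_path) at the top of its event handler and return early when it is False.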
Error Handling
Common Issues
Watcher Not Starting
Metrics Logged Multiple Times
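File system watchers can fire several events for a single write, so the same row may be logged more than once. One possible guard, sketched here with the hypothetical class EpochDeduplicator, is to remember the highest epoch already logged and skip rows that do not advance it. It assumes the metrics contain an increasing "epoch" value.

```python
class EpochDeduplicator:
    """Track the last logged epoch and reject rows that repeat it."""

    def __init__(self, key="epoch"):
        self.key = key
        self.last_seen = None

    def is_new(self, metrics):
        value = metrics.get(self.key)
        if value is None:
            return True  # Nothing to compare against, let it through.
        if self.last_seen is not None and value <= self.last_seen:
            return False  # Already logged this epoch.
        self.last_seen = value
        return True
```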
File Not Found Errors
When to Use This Pattern
Example Project
Next Steps
