TensorFlow/Keras
TensorFlow and Keras provide a callback system that makes metric logging clean and automatic. Create a custom callback to log metrics at the end of each epoch without cluttering your training code.
Quick Example
import tensorflow as tf
import valohai


class ValohaiMetricsCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        with valohai.metadata.logger() as logger:
            logger.log("epoch", epoch + 1)
            logger.log("accuracy", logs["accuracy"])
            logger.log("loss", logs["loss"])
            logger.log("val_accuracy", logs["val_accuracy"])
            logger.log("val_loss", logs["val_loss"])


# Use the callback
model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=10,
    callbacks=[ValohaiMetricsCallback()],
)

Why Use Callbacks?
Keras callbacks run at specific points during training. They let you:
Access all training metrics automatically
Keep metric logging separate from model code
Reuse the same callback across projects
Complete Working Example
Here's a full training script with Valohai integration:
valohai.yaml Configuration
Make sure to change the input data and environment to match your own values.
Logging Without valohai-utils
You can also log metrics using plain JSON:
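As a minimal sketch, assuming Valohai collects metadata from single-line JSON objects printed to standard output, the helper below builds and prints one record per epoch. The name `print_metrics` is hypothetical; you would call it from `on_epoch_end` in place of the `valohai.metadata.logger()` block.

```python
import json


def print_metrics(epoch, logs=None):
    """Print one epoch's metrics as a single JSON line and return the record."""
    logs = logs or {}
    # Keras counts epochs from 0; report them starting at 1.
    record = {"epoch": epoch + 1}
    for name, value in logs.items():
        # float() turns NumPy scalars into JSON-serializable Python floats.
        record[name] = float(value)
    print(json.dumps(record))
    return record
```

Printing the whole record in one `print` call keeps all metrics of an epoch together on one line.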
Logging Learning Rate
Track learning rate changes during training:
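One way to read the current learning rate is sketched below. It assumes the optimizer exposes `learning_rate` as a plain number or as a variable with a `.numpy()` method; schedule objects are not handled. The helper name `read_learning_rate` and the stand-in optimizer are hypothetical and only illustrate the call.

```python
from types import SimpleNamespace


def read_learning_rate(optimizer):
    """Return the optimizer's current learning rate as a Python float."""
    lr = optimizer.learning_rate
    # Keras variables and eager tensors expose .numpy(); plain numbers do not.
    if hasattr(lr, "numpy"):
        lr = lr.numpy()
    return float(lr)


# Stand-in for a Keras optimizer, used here only to show the call.
fake_optimizer = SimpleNamespace(learning_rate=0.001)
current_lr = read_learning_rate(fake_optimizer)
```

Inside a callback, the same helper could be used as `logger.log("learning_rate", read_learning_rate(self.model.optimizer))`.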
Combining Multiple Callbacks
Use multiple callbacks together:
Logging Custom Metrics
Add your own computed metrics:
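The sketch below shows one possible shape for this: derive extra values from what Keras already puts in `logs`, then log the combined dictionary. The metric names `loss_gap` and `error_rate` and the helper `add_custom_metrics` are illustrative assumptions, not names Valohai or Keras define.

```python
def add_custom_metrics(logs):
    """Return a copy of logs extended with metrics derived from its values."""
    metrics = {name: float(value) for name, value in logs.items()}
    if "loss" in metrics and "val_loss" in metrics:
        # A growing gap between validation and training loss hints at overfitting.
        metrics["loss_gap"] = metrics["val_loss"] - metrics["loss"]
    if "accuracy" in metrics:
        metrics["error_rate"] = 1.0 - metrics["accuracy"]
    return metrics
```

Each derived metric is only added when its inputs are present, so the helper also works for runs without validation data.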
Using LambdaCallback (Shorter Syntax)
For simple logging, use LambdaCallback:
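`LambdaCallback` accepts any function that takes `(epoch, logs)` as its `on_epoch_end` argument. The function below is a minimal sketch of such a handler; the name `log_epoch` is hypothetical, and it prints plain JSON rather than using valohai-utils.

```python
import json


def log_epoch(epoch, logs):
    """Print the epoch number and all metrics in logs as one JSON line."""
    values = {name: float(value) for name, value in (logs or {}).items()}
    line = json.dumps({"epoch": epoch + 1, **values})
    print(line)
    return line
```

You would then pass `tf.keras.callbacks.LambdaCallback(on_epoch_end=log_epoch)` in the `callbacks` list of `model.fit()`.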
Logging Per-Batch Metrics (Advanced)
For very long epochs, you might want to log progress mid-epoch:
Use sparingly: Logging every batch creates a lot of data. Only use for debugging or very long epochs.
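To keep the volume down, you can log only every N-th batch. The sketch below assumes plain JSON output and a hypothetical helper named `log_batch`; the default interval of 100 is an arbitrary example value.

```python
import json


def log_batch(batch, logs=None, every_n=100):
    """Print batch metrics as JSON every `every_n` batches; return True if logged."""
    if every_n < 1:
        raise ValueError("every_n must be at least 1")
    # Keras numbers batches from 0 within each epoch.
    if (batch + 1) % every_n != 0:
        return False
    record = {"batch": batch + 1}
    record.update({name: float(value) for name, value in (logs or {}).items()})
    print(json.dumps(record))
    return True
```

In a custom callback, this would be called from `on_train_batch_end(self, batch, logs=None)`.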
Best Practices
Always Convert to Python Types
The values in logs can be NumPy scalars or tensors rather than built-in Python numbers, and the json module cannot serialize those. Convert them with float() before logging.
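A small conversion helper might look like the sketch below. The name `to_python_number` is hypothetical; the helper assumes the value is a scalar that is either a plain number, a NumPy scalar, or an eager tensor with a `.numpy()` method.

```python
import json


def to_python_number(value):
    """Convert a NumPy scalar, eager tensor, or plain number to a Python float."""
    # Eager tensors expose .numpy(); unwrap them first.
    if hasattr(value, "numpy"):
        value = value.numpy()
    return float(value)


# Plain numbers pass through unchanged and serialize cleanly.
payload = json.dumps({"loss": to_python_number(0.5)})
```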
Handle Missing Metrics
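Not every metric is present in every callback hook. For example, val_* keys appear in logs only when validation runs, so check for a key before reading it instead of indexing logs directly.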
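One defensive pattern is to pick out only the keys that exist, as in this sketch. The helper name `select_metrics` and its default list of names are assumptions chosen to match the metrics used in the quick example.

```python
def select_metrics(logs, names=("loss", "accuracy", "val_loss", "val_accuracy")):
    """Return only the requested metrics that are present in logs."""
    logs = logs or {}
    return {name: float(logs[name]) for name in names if name in logs}
```

The result can be logged in a loop, so a run without validation data logs two metrics instead of raising `KeyError`.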
Use Descriptive Metric Names
Keep names consistent with Keras conventions:
Common Issues
Metrics Not Appearing
Symptom: Callback runs but no metrics in Valohai
Causes & Fixes:
Missing validation_data → Add validation split or data
Incorrect metric names → Check available keys in logs
JSON serialization error → Convert NumPy/Tensor types to float
Debug:
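To see what Keras actually passes to your callback, you can print the keys and value types of `logs`. This is a minimal sketch; the helper name `describe_logs` is hypothetical, and you would call it temporarily from `on_epoch_end`.

```python
def describe_logs(epoch, logs=None):
    """Print the metric names Keras passed to the callback, with value types."""
    logs = logs or {}
    summary = {name: type(value).__name__ for name, value in sorted(logs.items())}
    print(f"epoch {epoch + 1}: available metrics = {summary}")
    return summary
```

If an expected name is missing from the output, the metric was never computed; if a type is not `float` or `int`, convert the value before logging.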
Validation Metrics Missing
Symptom: Only training metrics logged, no validation metrics
Solution: Pass validation_data (or validation_split) to model.fit(). Keras adds val_* keys to logs only when validation runs.
Example Project
Check out our complete working example on GitHub:
The repository includes:
Complete training script with Valohai integration
valohai.yaml configuration
Example notebooks
Step-by-step setup instructions
Next Steps
Visualize your metrics in Valohai
Compare experiments to find the best hyperparameters
Learn more about Keras callbacks
Back to Collect Metrics overview
