Early Stopping
Early stopping automatically stops executions or Tasks when they meet specific metadata conditions.
This saves time and compute costs by stopping work that's no longer useful, like a model that's already hit your accuracy target or an experiment that's clearly failing.
How it works
You define a stopping rule based on metadata that your training code logs (e.g., accuracy, val_loss, epoch).
Valohai monitors the metadata during execution. When the condition is met, the execution stops immediately.
For Tasks, you can also configure what happens to other executions when one meets the stop condition.
Stop condition
Early stopping works for both Tasks and single executions.
Early stopping halts that execution or Task when the condition is met.
Set up early stopping
Create a new execution or Task
Scroll to the Runtime section
Check the box Set stop condition
Define your rule:
Metadata key: The metric to monitor (e.g.,
accuracy,val_loss)Operator:
>,<,=,>=,<=Value: The threshold (e.g.,
0.95)
Start your execution or Task
Example: Stop when accuracy exceeds 0.95
Metadata key: accuracy
Operator: >
Value: 0.95
Valohai monitors the accuracy metadata. Once it exceeds 0.95, the execution stops.
Example: Stop when loss drops below threshold
Metadata key: val_loss
Operator: <
Value: 0.1
The execution stops as soon as validation loss falls below 0.1.
Use cases
Cost optimization: Stop training once your model hits the target metric—no need to run all planned epochs.
Fast failure: Halt executions that show poor performance early (e.g., loss is increasing instead of decreasing).
Hyperparameter search: In a Grid Search or Bayesian Task, stop the entire Task once one execution finds the optimal configuration.
Example: Stopping a Grid Search Task
You're running a Grid Search with 50 executions to find the best learning rate.
Goal: Stop the Task when any execution achieves accuracy > 0.98.
Setup:
Create the Grid Search Task
Set stop condition:
accuracy > 0.98Choose: Stop all executions
Once one execution hits 0.98 accuracy, Valohai stops the entire Task—saving compute on the remaining executions.
Next steps
Logging metadata for early stopping
Task Blueprints to define early stopping rules in YAML
Bayesian Optimization for intelligent hyperparameter search
Grid Search for exhaustive parameter exploration
Last updated
Was this helpful?
