Interactive hyperparameter optimization can significantly expedite and streamline the hyperparameter tuning process compared to methods like random search or exhaustive grid search.
Valohai supports two notable hyperparameter optimization frameworks for this purpose:
- Optuna: A comprehensive hyperparameter optimization framework.
- Hyperopt (Tree Parzen Estimator - TPE): Utilizes the Tree Parzen Estimator algorithm for optimization.
These frameworks leverage the hyperparameters and outputs of previous executions to suggest optimal hyperparameter settings for future executions, effectively learning and adapting based on past results.
Here’s a high-level overview of how Bayesian optimization, including TPE, operates under the hood:
- Create Startup Executions: Initial executions are generated using random search to explore a broad range of hyperparameter values.
- Model Relationship: Based on these initial executions, a simplified function is created to model the relationship between hyperparameters and the target metric value (e.g., “loss”).
- Optimize Hyperparameters: Using this simplified model, the optimization algorithm seeks to find the optimal hyperparameter values that bring the target metric as close as possible to the desired value.
- Iterate: The process repeats, with each iteration refining the model and proposing new sets of hyperparameter values.
This iterative approach allows the optimization algorithm to learn and adapt over time, ultimately leading to more efficient and effective hyperparameter tuning.
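Conceptually, this is the same loop that Optuna runs. Below is a minimal, illustrative Optuna sketch of the suggest-evaluate-refine cycle; `train_and_evaluate` and the parameter names are hypothetical placeholders, not part of Valohai's API:

```python
import optuna

def train_and_evaluate(learning_rate: float, num_layers: int) -> float:
    # Hypothetical stand-in for a real training run; returns a "loss".
    return (learning_rate - 0.01) ** 2 + abs(num_layers - 4)

def objective(trial):
    # The sampler (TPE by default) suggests values based on the
    # hyperparameter/metric pairs observed in earlier trials.
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
    num_layers = trial.suggest_int("num_layers", 1, 8)
    return train_and_evaluate(learning_rate, num_layers)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)  # each trial refines the surrogate model
```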
Create a Bayesian optimization task
What do you need?
To create a Bayesian optimization task, you'll need at least one step with parameters defined in your valohai.yaml; a minimal example follows this note.
You can also use the web app to launch a task based on a completed execution.
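For reference, a step with tunable parameters in valohai.yaml might look like the following sketch (the step name, Docker image, and parameter names are illustrative, not required values):

```yaml
- step:
    name: train-model
    image: python:3.10
    command: python train.py {parameters}
    parameters:
      - name: learning_rate
        type: float
        default: 0.001
      - name: epochs
        type: integer
        default: 10
```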
Follow these steps to create a Bayesian optimization task in Valohai:
- Go to your Valohai Project and navigate to the Task page.
- Click on “Create task” to initiate the task creation process.
- Choose the step in your project where you have defined the parameters you want to optimize using Bayesian optimization.
- Scroll down to the “Parameters” section.
- In the “Task type” dropdown, select “Bayesian optimization.”
- Configure the following settings based on your preferences:
| Setting | Description |
|---|---|
| Early stopping | Set early stopping criteria based on metadata printed by each execution in the Task. When one of the executions in the Task meets the criteria, the entire Task is stopped. |
| Optimization engine | By default, Valohai uses Optuna as the optimization engine, but you can also choose Hyperopt. |
| Maximum execution count | Define how many executions Valohai can launch in total for this Task. We recommend more than 30 executions for effective optimization. |
| Execution batch size | Valohai runs executions in batches. Specify how many executions should run in a single batch. |
| Optimization target metric | Specify the metadata metric you want to optimize during the Bayesian optimization process. |
| Optimization target value | Set the target value for your chosen metadata metric. |
After configuring these settings according to your requirements, click on “Create task.”
Valohai will then initiate the Bayesian optimization task using the specified settings, allowing you to efficiently tune your model’s hyperparameters.
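The optimization target metric and value refer to metadata that your executions print. Valohai collects metadata from JSON printed to standard output, so each execution should emit the metric being optimized, roughly like this sketch (the training loop and values are hypothetical):

```python
import json

def report_metadata(epoch: int, loss: float) -> None:
    # Valohai collects JSON printed to stdout as execution metadata;
    # the Task reads its optimization target metric (here "loss") from it.
    print(json.dumps({"epoch": epoch, "loss": loss}))

# Hypothetical training loop:
for epoch in range(10):
    loss = 1.0 / (epoch + 1)  # placeholder for a real training metric
    report_metadata(epoch, loss)
```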
Use Bayesian optimization only when the execution count is over 30
We recommend using Bayesian optimization when creating more than 30 executions to ensure the optimizer has enough base values to use TPE effectively.
Valohai follows the Hyperopt recommendation and runs the first 20 executions with random search before switching to TPE, to ensure the best results. If your Task has fewer than 20 executions, all of them will be based on random search rather than TPE optimization.
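For intuition, Optuna exposes the same behavior through its TPE sampler, where the number of random startup trials is configurable:

```python
import optuna

# The first n_startup_trials are sampled randomly before TPE takes over,
# mirroring the 20 random-search runs described above.
sampler = optuna.samplers.TPESampler(n_startup_trials=20)
study = optuna.create_study(direction="minimize", sampler=sampler)
```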
Exploring Quantization Step (Q.Step)
The `Q.Step` parameter helps in adjusting the detail level of your hyperparameter search. By setting `Q.Step`, Valohai fine-tunes hyperparameter values, rounding them to the nearest specified interval. This rounding process is key to managing the search space efficiently, especially when dealing with hyperparameters that change in fixed increments (like the number of layers in a neural network). For instance, setting `Q.Step` to `1` means the search will consider only whole-number values, streamlining the tuning process by focusing on distinct, evenly spaced points within the parameter range.
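The rounding can be pictured as snapping each suggested value to the nearest multiple of `Q.Step` (a simplified sketch, not Valohai's internal implementation):

```python
def quantize(value: float, q_step: float) -> float:
    # Snap a suggested value to the nearest multiple of q_step.
    return round(value / q_step) * q_step

print(quantize(3.7, 1.0))    # 4.0  -> whole-number search, e.g., layer counts
print(quantize(0.37, 0.25))  # 0.25 -> evenly spaced points within the range
```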
Logarithmic Search Parameters
For logarithmic search parameters, such as those using the `Logarithmic Uniform` distribution, it's essential to understand the interpretation of `min` and `max`:
- Numeric Limits vs. Exponents: Unlike linear ranges, `min` and `max` in logarithmic parameters act as base-10 exponents. This means the search explores values from `10^min` to `10^max`. This method is invaluable for hyperparameters that span several orders of magnitude, such as learning rates or regularization coefficients.
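As a simplified sketch (not Valohai's internal implementation), sampling from such a distribution works like this:

```python
import random

def sample_log_uniform(min_exp: float, max_exp: float) -> float:
    # min and max act as base-10 exponents: the sample falls in
    # [10**min_exp, 10**max_exp], uniformly distributed in log space.
    return 10 ** random.uniform(min_exp, max_exp)

# e.g., a learning-rate search spanning 1e-5 to 1e-1:
lr = sample_log_uniform(-5, -1)
```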