Bayesian Optimization

Bayesian Optimization intelligently explores the hyperparameter space by learning from previous executions.

Instead of testing every combination (Grid Search) or sampling randomly (Random Search), Bayesian optimization engines such as Optuna and HyperOpt suggest promising hyperparameters based on past results, making optimization more sample-efficient.

How it works

Bayesian Optimization follows this loop:

  1. Initial exploration: Run a batch of executions with random hyperparameters to explore the space

  2. Model the relationship: Build a probabilistic model connecting hyperparameters to your target metric (e.g., accuracy, loss)

  3. Suggest next values: Use the model to identify promising hyperparameter combinations

  4. Iterate: Run new executions with suggested values, update the model, and repeat

This approach typically converges on strong hyperparameters in fewer executions than exhaustive or random search, which is especially valuable when each execution is expensive (e.g., training large models).
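To see the loop in code, here is a minimal runnable sketch of the same idea written directly against Optuna, one of the engines Valohai supports. The objective function below is a toy stand-in for a training execution; inside Valohai you configure all of this through the Task UI instead of writing it yourself.

```python
import math
import optuna

# Toy objective standing in for a training execution: a real one would
# train a model and return the logged metric (e.g., validation accuracy).
def objective(trial):
    learning_rate = trial.suggest_float("learning_rate", 1e-5, 1e-1, log=True)
    num_layers = trial.suggest_int("num_layers", 2, 10)
    # Pretend the score peaks around lr = 1e-3 with 6 layers.
    return -((math.log10(learning_rate) + 3) ** 2) - 0.1 * (num_layers - 6) ** 2

# Optuna's default sampler starts with random startup trials (compare
# Valohai's random seeding), then models past results to suggest
# promising values for later trials.
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params, study.best_value)
```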

Before you start

You need:

  • At least one step with parameters defined in your valohai.yaml

  • A metadata metric that your training code logs, such as accuracy or val_loss (see the logging sketch below)

  • At least 30 planned executions for effective optimization

⚠️ Important: Valohai runs the first 20 executions with Random Search to seed the optimizer. If you create fewer than 20 executions, the Task will use Random Search instead of Bayesian Optimization.
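Valohai collects these metrics from your execution's standard output: each single-line JSON object your code prints becomes metadata. A minimal sketch, with a random walk standing in for real training (the accuracy key is only an example; it must match the Optimization target metric you set on the Task):

```python
import json
import random

# Valohai parses each single-line JSON object printed to stdout
# as execution metadata; a random walk fakes training progress here.
accuracy = 0.5
for epoch in range(10):
    accuracy = min(1.0, accuracy + random.uniform(0.0, 0.05))
    print(json.dumps({"epoch": epoch, "accuracy": accuracy}))
```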

Create a Bayesian Optimization Task

  1. Open your project in Valohai

  2. Go to the Tasks tab

  3. Click Create Task

  4. Select the step with your parameters

  5. Scroll to Parameters

  6. Select Bayesian optimization as the Task type

  7. Configure the settings:

  • Early stopping: Stop the Task when an execution meets a metadata condition (e.g., accuracy > 0.95)

  • Optimization engine: Choose Optuna (default) or HyperOpt

  • Maximum execution count: The total number of executions to run; 30 or more is recommended for effective optimization

  • Execution batch size: How many executions run in parallel before the next batch starts

  • Optimization target metric: The metadata key to optimize (must match what your training code logs)

  • Optimization target value: The value to aim for; whether that means minimizing or maximizing depends on the metric (e.g., 1.0 for accuracy, 0 for loss)

  8. For each parameter, define:

    • Distribution type (Uniform, Log Uniform, Integer, Categorical)

    • Min and max values (or categories for categorical parameters)

    • Q.Step (optional, see below)

  9. Click Create task

Valohai will start with random exploration, then use Bayesian optimization to suggest hyperparameters for subsequent batches.

Understanding Q.Step

Q.Step quantizes parameter values, rounding each suggestion to the nearest multiple of the step and shrinking the search space.

Example: If you're optimizing the number of layers in a neural network (which must be an integer), set Q.Step to 1. The optimizer will then only suggest whole numbers like 3, 4, or 5 instead of 3.7 or 4.2; the rounding rule itself is sketched in code after the list below.

This is useful for:

  • Discrete parameters: Integers, specific increments

  • Reducing search space: When only certain values make sense (e.g., batch sizes that are powers of 2)
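As a rough sketch of the rounding rule (this mirrors how, for example, HyperOpt defines its quantized quniform distribution):

```python
import random

def apply_q_step(value, q):
    # Round a continuous sample to the nearest multiple of q.
    return round(value / q) * q

sample = random.uniform(16, 128)
print(sample, "->", apply_q_step(sample, 16))  # e.g., 57.3 -> 64.0
```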

Logarithmic parameter distributions

For parameters that span multiple orders of magnitude (like learning rates), use the Log Uniform distribution.

In this case, min and max are exponents:

  • min = -5 and max = -1 means the optimizer explores values from 10^-5 to 10^-1 (0.00001 to 0.1)

This gives equal attention to every order of magnitude in the range, which is critical for hyperparameters like learning rates or regularization coefficients.
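A quick sketch of how the exponent bounds map to concrete values, assuming the engine samples the exponent uniformly as described above:

```python
import random

# Sample the exponent uniformly between min and max, then map it back;
# every order of magnitude from 1e-5 to 1e-1 is equally likely.
exponent = random.uniform(-5, -1)
learning_rate = 10 ** exponent
print(exponent, "->", learning_rate)
```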

Example: Bayesian Optimization Task

Goal: Optimize a model to maximize validation accuracy

Configuration:

  • Optimization target metric: val_accuracy

  • Optimization target value: 1.0 (maximize)

  • Maximum execution count: 50

  • Execution batch size: 5

Parameters:

  • learning_rate: Log Uniform, min -5, max -1

  • batch_size: Integer, min 16, max 128, Q.Step 16

  • num_layers: Integer, min 2, max 10, Q.Step 1

Valohai will run 20 random executions first, then use Bayesian optimization to suggest learning rates, batch sizes, and layer counts for the remaining 30 executions.
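For intuition, here is roughly how that search space maps onto Optuna's distribution types. This is illustrative only; in Valohai you enter these values in the Task form rather than in code.

```python
from optuna.distributions import FloatDistribution, IntDistribution

# Illustrative mapping of the example table to Optuna distributions.
search_space = {
    "learning_rate": FloatDistribution(1e-5, 1e-1, log=True),  # exponents -5..-1
    "batch_size": IntDistribution(16, 128, step=16),           # Q.Step 16
    "num_layers": IntDistribution(2, 10, step=1),              # Q.Step 1
}
print(search_space)
```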

When to use Bayesian Optimization

Use it when:

  • Each execution is expensive (long training times, large models)

  • You have 30+ executions to run

  • You're optimizing 2-10 hyperparameters

Skip it when:

  • You need quick results with fewer than 30 executions (use Random Search)

  • Your parameter space is small enough for Grid Search

  • You have specific parameter combinations in mind (use Manual Search)

