Queue Priority

Run time-sensitive executions ahead of the queue using high priority scheduling

Queue priority lets you run urgent executions immediately instead of waiting for the entire queue to clear.

Use this when you need to:

  • Deploy a model update to production right away

  • Re-run a failed critical experiment

  • Process time-sensitive inference requests

Without priority, all executions run first-in, first-out (FIFO). With priority enabled, you create two separate queues: high priority executions run before normal priority ones.


How It Works

When you create an execution with high priority:

  1. It's placed at the end of the high-priority queue

  2. All high-priority executions run before any normal-priority executions

  3. Within each priority level, executions still run FIFO

Important: Running executions are never interrupted. If normal-priority jobs are already running when you queue a high-priority execution, those running jobs will complete first.


Enable Priority at Creation

Web App

Toggle High Priority next to the Create Execution button when starting a new execution.

Command Line

Add the --priority flag:

# Run the Step named train-model with high priority
vh exec run train-model --priority

Change Priority While Queued

You can upgrade (or downgrade) an execution's priority while it's waiting in the queue.

When you change priority:

  • The execution is removed from its current position

  • It's rescheduled at the end of the new priority queue

This is useful when priorities shift — for example, if a stakeholder flags an experiment as urgent after it's already been queued.


Availability

Queue priority works with:

  • On-premises execution environments

Not supported:

  • Cloud virtual machine environments

  • Kubernetes environments

  • Slurm environments

If you try to set priority on an unsupported environment via CLI or API, you'll receive an error response. The priority toggle in the web app only appears for supported environments.


When to Use Priority

✅ Best For

Dedicated, fixed-capacity environments where you have:

  • A constant pool of workers (e.g., 10 GPUs always running)

  • Mixed workloads (urgent production jobs + background experiments)

  • Predictable queue times

Priority helps you process critical work immediately while batching lower-priority jobs during off-hours.

⚠️ Less For

Auto-scaling environments that spin up capacity on-demand.

These environments already handle bursts by adding workers automatically, so the entire queue gets processed quickly regardless of priority. You can still use priority for organizational clarity, but it won't dramatically change execution times.


Performance Considerations

Priority queueing requires the queue service to do more work when scheduling executions.

For most use cases, this overhead is negligible. However, very high-capacity environments (hundreds of simultaneous executions) may experience slightly slower queue scheduling or need additional resources allocated to queue services.

This is a theoretical concern for most deployments, contact Valohai support if you're running at extreme scale and notice queue performance issues.

Last updated

Was this helpful?