Queue Priority
Run time-sensitive executions ahead of the queue using high priority scheduling
Queue priority lets you run urgent executions immediately instead of waiting for the entire queue to clear.
Use this when you need to:
Deploy a model update to production right away
Re-run a failed critical experiment
Process time-sensitive inference requests
Without priority, all executions run first-in, first-out (FIFO). With priority enabled, you create two separate queues: high priority executions run before normal priority ones.
How It Works
When you create an execution with high priority:
It's placed at the end of the high-priority queue
All high-priority executions run before any normal-priority executions
Within each priority level, executions still run FIFO
Important: Running executions are never interrupted. If normal-priority jobs are already running when you queue a high-priority execution, those running jobs will complete first.
Enable Priority at Creation
Web App
Toggle High Priority next to the Create Execution button when starting a new execution.

Command Line
Add the --priority flag:
# Run the Step named train-model with high priority
vh exec run train-model --priorityChange Priority While Queued
You can upgrade (or downgrade) an execution's priority while it's waiting in the queue.
When you change priority:
The execution is removed from its current position
It's rescheduled at the end of the new priority queue
This is useful when priorities shift — for example, if a stakeholder flags an experiment as urgent after it's already been queued.
Availability
Queue priority works with:
On-premises execution environments
Not supported:
Cloud virtual machine environments
Kubernetes environments
Slurm environments
If you try to set priority on an unsupported environment via CLI or API, you'll receive an error response. The priority toggle in the web app only appears for supported environments.
When to Use Priority
✅ Best For
Dedicated, fixed-capacity environments where you have:
A constant pool of workers (e.g., 10 GPUs always running)
Mixed workloads (urgent production jobs + background experiments)
Predictable queue times
Priority helps you process critical work immediately while batching lower-priority jobs during off-hours.
⚠️ Less For
Auto-scaling environments that spin up capacity on-demand.
These environments already handle bursts by adding workers automatically, so the entire queue gets processed quickly regardless of priority. You can still use priority for organizational clarity, but it won't dramatically change execution times.
Performance Considerations
Priority queueing requires the queue service to do more work when scheduling executions.
For most use cases, this overhead is negligible. However, very high-capacity environments (hundreds of simultaneous executions) may experience slightly slower queue scheduling or need additional resources allocated to queue services.
This is a theoretical concern for most deployments, contact Valohai support if you're running at extreme scale and notice queue performance issues.
Last updated
Was this helpful?
