Batch Inference
Batch inference in Valohai runs as a standard execution, letting you process datasets or file collections at scale without managing infrastructure.
How it works
Batch inference uses the same execution system you use for training:
Define a step in valohai.yaml with your inference code (shown in the sketch below)
Specify inputs (model files and data to process)
Run the execution via CLI, API, or schedule it
Collect results from outputs
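As a rough sketch, a batch inference step could be defined like this. The step name, Docker image, script name, and input sources are placeholders, not values from this page:

```yaml
- step:
    name: batch-inference
    image: python:3.10
    command:
      - pip install -r requirements.txt
      - python predict.py
    inputs:
      - name: model
        # placeholder datum alias pointing at a trained model file
        default: datum://production-model
      - name: data
        # placeholder bucket path; a wildcard pulls in multiple files
        default: s3://my-bucket/batch-data/*.csv
```

Inputs are downloaded for the script under /valohai/inputs/<input name>/, and anything the script writes to /valohai/outputs/ is collected and versioned when the execution finishes.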
Key advantage: You already know this system. If you've run training jobs, you can run inference jobs.
What you can do
Process thousands of images, CSVs, or other file types
Schedule recurring inference jobs (e.g., nightly predictions)
Trigger inference via API when new data arrives
Chain inference into pipelines after training completes
Track inference metrics alongside training metrics
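For example, chaining inference after training can be expressed as a pipeline in the same valohai.yaml. The step names, node names, and output filename pattern below are illustrative assumptions, and the edge shorthand is one way to wire a training output into the inference input:

```yaml
- pipeline:
    name: train-then-predict
    nodes:
      - name: train
        type: execution
        step: train-model        # assumed training step defined elsewhere in valohai.yaml
      - name: predict
        type: execution
        step: batch-inference    # the inference step sketched above
    edges:
      # route the trained model file from the training node
      # into the "model" input of the inference node
      - [train.output.model*, predict.input.model]
```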
Example use cases
Image classification at scale: Process a directory of product images to tag inventory items.
Batch predictions on tabular data: Run monthly churn predictions on your entire customer database.
Document processing: Extract entities from legal documents or medical records in batches.
When to use batch inference
Choose batch inference when:
You're processing datasets, not individual requests
Latency requirements are in minutes or hours, not milliseconds
You want to leverage Valohai's execution tracking and versioning
You need to schedule or automate inference runs
Need lower latency? Check out Real-Time Endpoints for sub-second predictions.
Next steps
See the practical examples, or jump straight to defining your inference step in valohai.yaml.