Valohai “master”
Most users don’t need to focus on this component. It is a dedicated environment, such as app.valohai.com, where users log in to the platform. This environment hosts the web application, auto-scaling services, API endpoints, user management, and contextual information about projects and executed jobs. It can also be self-hosted within your own infrastructure.
Compute & Data
Generally, each machine learning job scheduled through Valohai runs in your chosen cloud or on-premises environment. Valohai does not directly access the machines that run the code; instead, the code is retrieved from your data store and executed on your selected machine.
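For illustration, a minimal sketch of how such a job is defined in valohai.yaml is shown below; the step name, Docker image, and commands are placeholders, not part of any real project:

```yaml
# A minimal step definition in valohai.yaml (names and image are illustrative).
# When this step is run, Valohai executes the commands inside the given Docker
# image on a machine in your own cloud or on-premises environment.
- step:
    name: train-model
    image: python:3.11
    command:
      - pip install -r requirements.txt
      - python train.py
```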
The worker environment
This is a virtual machine, Kubernetes node, or an on-premises machine that can run Valohai jobs. It can be any Linux-based “machine” associated with one or more “queues” (e.g., “High Memory CPU” or “2xA100 Queue”). When a user schedules a job to a specific queue, Valohai orchestrates the process so that an available machine picks it up. If all machines are busy, the job remains queued until a machine becomes available.
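As a sketch, a step can also declare a default environment for its jobs directly in valohai.yaml; the environment slug below is a placeholder for one defined in your own Valohai setup:

```yaml
# Pinning a step to a specific worker environment (slug is illustrative).
# If every machine serving the corresponding queue is busy, the job waits
# in the queue until one becomes available.
- step:
    name: train-model
    image: python:3.11
    environment: aws-eu-west-1-g4dn-xlarge
    command:
      - python train.py
```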
Valohai Agent (peon)
The Valohai agent, also known as “peon,” is responsible for retrieving jobs from designated job queues and executing them on the server where it runs. It handles downloading the relevant code, data, and Docker image, and it manages uploading generated files as well as posting live output back to Valohai so users can monitor progress in the interface.
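To make the agent’s role concrete, here is a hedged sketch of how data flows through it for a single step; the input name and dataset URL are placeholders:

```yaml
# The agent downloads each declared input to /valohai/inputs/<input-name>/
# before the command runs, and uploads any files the command writes to
# /valohai/outputs/ back to your data store after it finishes. The dataset
# URL below is a placeholder for one of your own.
- step:
    name: train-model
    image: python:3.11
    command:
      - python train.py --data /valohai/inputs/training-data/train.csv
    inputs:
      - name: training-data
        default: s3://example-bucket/datasets/train.csv
```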