Valohai “master”
Most users don’t have to worry about this. It’s a dedicated environment, such as app.valohai.com, where users log in to the platform. This environment includes the web application, auto-scaling services, API services, and user management, and it hosts contextual information about the projects and jobs executed in the environment. It can also be self-hosted inside your own environment.
Compute & Data
Generally speaking, each machine learning job scheduled from Valohai is executed in your cloud or on-premises environment. Valohai doesn’t have direct access to the machines that execute code; code and data are downloaded from your data store directly to your machines.
The worker environment
A virtual machine, Kubernetes node, or an on-premises machine that is available to run Valohai jobs. This can be any “machine” that’s running Linux. Each worker belongs to one or more “queues”, for example “High Memory CPU” or “2xA100 Queue”. The user schedules a job to a queue, and Valohai orchestrates the right available machine to pick the job up; if all machines are busy, Valohai keeps the job queued until a machine frees up.
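In practice, the queue a job lands in is typically chosen per step in the project’s `valohai.yaml`. The sketch below is illustrative only: the step name, image, and environment slug are hypothetical placeholders, not values from this document.

```yaml
- step:
    name: train-model
    image: python:3.11
    # Hypothetical environment slug; it would map to one of your
    # worker queues, e.g. a "High Memory CPU" machine group.
    environment: my-org-high-memory-cpu
    command: python train.py
```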
Valohai Agent (peon)
The Valohai agent is responsible for picking up jobs from its designated job queue(s) and executing them on the server it runs on. The peon downloads the relevant code, data, and Docker image, and it uploads generated files and posts live outputs back to Valohai so the user can see them in the user interface.
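The queue-and-agent flow described above can be sketched as a small simulation. This is not the actual peon implementation; all class and queue names here are made up purely to illustrate the pick-up/execute/report loop.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Job:
    """A scheduled job, targeted at a named queue."""
    job_id: str
    queue: str

class JobQueue:
    """Holds jobs until a worker subscribed to this queue is free."""
    def __init__(self, name: str):
        self.name = name
        self._jobs = deque()

    def push(self, job: Job) -> None:
        self._jobs.append(job)

    def pop(self):
        return self._jobs.popleft() if self._jobs else None

class Peon:
    """Simplified agent: polls its queues and 'runs' one job at a time."""
    def __init__(self, queues):
        self.queues = queues
        self.completed = []

    def poll_once(self):
        for q in self.queues:
            job = q.pop()
            if job is not None:
                # The real agent would pull code, data, and the Docker
                # image here, run the command, then upload outputs and
                # live logs back to Valohai.
                self.completed.append(job.job_id)
                return job
        return None  # all subscribed queues are empty; job stays queued elsewhere

high_mem = JobQueue("High Memory CPU")
gpu = JobQueue("2xA100 Queue")
gpu.push(Job("train-01", "2xA100 Queue"))

worker = Peon([gpu])          # this worker only serves the GPU queue
job = worker.poll_once()
print(job.job_id)             # train-01
```

The point of the sketch is the indirection: users target a queue, not a machine, and whichever subscribed worker is free pulls the job.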