Valohai offers versatile deployment options, including cloud and on-premise environments. It supports multi-cloud and hybrid setups, as well as a fully self-hosted installation.
When using Valohai you can be sure that:
- data remains within your environment
- machine learning jobs run on your virtual machines or Kubernetes cluster
- models are deployed to your object stores and registries
Why Valohai?
Valohai simplifies infrastructure management, experiment tracking, and pipeline orchestration, allowing data scientists to focus on building custom machine learning models.
Components
Valohai comprises two layers: the Application Layer and the Data & Compute Layer. The Valohai application doesn't directly access your workers; instead, it writes requests and updates to the job queue machine, from where your own workers pick up the jobs scheduled for them.
Application Layer
Hosted by Valohai
- The web application
- A PostgreSQL database for storing user and execution metadata (who ran what, and when)
- A scaling service responsible for managing virtual machine instances
- A deployment image builder for building and publishing images for online inference
- The API services that enable you to:
- Launch, view, and manage jobs, pipelines, and deployments
- Access execution results, files, and metrics
- Launch batch processing jobs
- …more Valohai APIs
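As a sketch of how the API services above are typically reached, the snippet below builds a token-authenticated request using only the Python standard library. The endpoint path and header format shown here are assumptions for illustration; consult the Valohai API reference for the exact URLs and authentication details.

```python
import urllib.request

# Assumed endpoint for listing executions; verify against the Valohai API reference.
API_URL = "https://app.valohai.com/api/v0/executions/"
API_TOKEN = "YOUR_API_TOKEN"  # placeholder; generate a real token in the Valohai UI

# Token-based authentication is passed via the Authorization header.
request = urllib.request.Request(
    API_URL,
    headers={"Authorization": f"Token {API_TOKEN}"},
)

# Sending the request would return the executions your token can access:
# with urllib.request.urlopen(request) as response:
#     print(response.read())
```

Because the request object is built before it is sent, the same pattern works for launching jobs or fetching metrics by swapping the endpoint and HTTP method.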
Data & Compute Layer
Hosted in your cloud/on-premises
- A virtual machine that manages the machine learning job queue
- Object storage(s) to store job logs, snapshots, and generated files (e.g. models, dataset snapshots)
- Autoscaled virtual machines to run machine learning jobs
- Other optional services, like:
- Your existing Kubernetes cluster, to enable teams to deploy models for real-time inference
- Private Docker Registry
- Other data sources (databases, data warehouses, etc.)
- A Spark cluster
Self-Hosted Valohai
Need to run Valohai completely inside your own network? See the guide Deploy a self-hosted Valohai.