Getting Started

Valohai is an MLOps platform that handles infrastructure complexity while you build production ML systems. Train models, run experiments, and deploy to production, all without DevOps overhead.

Core Capabilities

Full experiment tracking and lineage

Every run becomes reproducible and auditable.

  • Automatic versioning — Code, data, parameters, and environments captured on every run

  • Metric comparison — Compare runs, spot regressions, track model drift

  • Dataset versioning — Link datasets to experiments without storage duplication
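As a rough illustration of how run metrics might reach the comparison views, the sketch below prints one JSON object per line to standard output. It assumes the platform collects JSON lines printed by your code as execution metadata; check the metadata documentation before relying on this. The helper names `format_metrics` and `log_metrics` and the loss values are hypothetical.

```python
import json


def format_metrics(**metrics):
    """Serialize one set of metrics as a single-line JSON object."""
    return json.dumps(metrics, sort_keys=True)


def log_metrics(**metrics):
    """Print the metrics as one JSON line and return the printed line."""
    line = format_metrics(**metrics)
    # Assumption: each JSON line on stdout is picked up as run metadata.
    print(line, flush=True)
    return line


if __name__ == "__main__":
    # Placeholder values standing in for a real training loop.
    for epoch, loss in enumerate([0.9, 0.5, 0.3], start=1):
        log_metrics(epoch=epoch, loss=loss)
```

Keeping each record on a single line means the output stays easy to parse even when it is interleaved with ordinary log messages.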

Infrastructure abstraction

Run ML workloads on any compute with one command.

  • Multi-cloud execution — AWS, GCP, Azure, Oracle Cloud Infrastructure, Scaleway, OVH, Slurm, Kubernetes, or on-premises hardware

  • Elastic scaling — Same code runs on 1 GPU or 100 GPUs

  • Production deployment — Batch inference, REST APIs, or streaming endpoints with built-in monitoring

Framework agnostic

Your code, your tools, zero lock-in.

  • Any ML framework — PyTorch, TensorFlow, JAX, Hugging Face, or custom stacks

  • Simple integration — Add a valohai.yaml to any project

  • API-first design — REST API and webhooks for CI/CD pipelines
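To make the integration point more concrete, here is a minimal sketch of a training entry point that a step defined in valohai.yaml could invoke. It assumes parameters arrive as command-line flags such as `--epochs=10` and that the output directory is exposed through a `VH_OUTPUTS_DIR` environment variable; treat both as assumptions to verify against the platform documentation. The parameter names, the `outputs` fallback directory, and the `summary.json` file name are hypothetical.

```python
import argparse
import json
import os
from pathlib import Path


def parse_args(argv=None):
    """Parse hypothetical training parameters passed as command-line flags."""
    parser = argparse.ArgumentParser(description="Hypothetical training entry point")
    parser.add_argument("--epochs", type=int, default=5)
    parser.add_argument("--learning_rate", type=float, default=0.001)
    return parser.parse_args(argv)


def output_dir():
    """Return the outputs directory, falling back to a local folder."""
    # Assumption: VH_OUTPUTS_DIR is set when running on the platform.
    return Path(os.environ.get("VH_OUTPUTS_DIR", "outputs"))


def save_summary(args, directory):
    """Write the parsed parameters to summary.json and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / "summary.json"
    path.write_text(json.dumps(vars(args), sort_keys=True))
    return path
```

Calling `save_summary(parse_args(), output_dir())` at the end of a script would record the parameters of the run next to its other outputs. Because of the local fallback directory, the same script also runs unchanged on a laptop.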

Who uses Valohai?

  • Data Scientists & ML Engineers — Focus on model development instead of cloud configurations

  • MLOps Teams — Standardize workflows across projects without forcing tool changes

  • Enterprise ML Teams — Meet compliance requirements with full audit trails and data lineage

Start Building

  • Run your first execution in 5 minutes

  • Interactive learning path with hands-on exercises

  • Import a working computer vision pipeline

  • Adapt language models to your domain

💡 First time with MLOps? Start with Valohai Academy for guided tutorials that build from basics to advanced workflows.
