Let Data Scientists Be Scientists
Data scientists shouldn't wrestle with cloud permissions, network configurations, or storage mounts. That's infrastructure work, not science.
Valohai inverts the traditional model: your ops team owns the environments, your data scientists own the experiments. Whether you run on AWS, Azure, GCP, or on-premises hardware, Valohai abstracts the infrastructure details away.
The Problem with Traditional Platforms
Most ML platforms force data scientists to become part-time DevOps engineers. These platforms demand cloud credentials, network configuration, and security-policy setup before anyone can run a single experiment.
This approach breaks down because:
Data scientists waste time on infrastructure instead of model improvement
Security risks multiply when everyone needs cloud access
Onboarding takes weeks instead of hours
How Valohai Works Differently
Zero Cloud Credentials for Data Scientists
Your data science team never touches:
Cloud provider CLIs or authentication tokens
Virtual networks, subnets, or firewall rules
Identity management or permission policies
Storage bucket configurations or access keys
Instead, they select pre-configured environments and run experiments.
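To make that concrete, here is a minimal sketch of what a training script looks like under this model. It assumes an input named `dataset` has been declared for the step and that pandas is available in the step's Docker image; the `/valohai/inputs` and `/valohai/outputs` paths are the conventional mount points Valohai provides inside an execution.

```python
# train.py -- a minimal sketch of a training script with no cloud code.
# Valohai mounts declared inputs under /valohai/inputs/<input-name>/ and
# uploads anything written to /valohai/outputs/ when the execution ends.
# The input name "dataset" and the CSV filename are illustrative assumptions.
from pathlib import Path

import pandas as pd  # assumed to be present in the step's Docker image

INPUTS = Path("/valohai/inputs")
OUTPUTS = Path("/valohai/outputs")

def main():
    # Read training data from the path Valohai mounted; no bucket URLs,
    # no access keys -- the ops-configured data store handled that.
    data = pd.read_csv(INPUTS / "dataset" / "train.csv")

    # ... train a model here ...

    # Anything written to /valohai/outputs is uploaded to the project's
    # data store using credentials the data scientist never sees.
    (OUTPUTS / "model.txt").write_text(f"trained on {len(data)} rows")

if __name__ == "__main__":
    main()
```

Note what is absent: no boto3, no connection strings, no secrets in environment variables.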
Clear Separation of Concerns
Infrastructure Team Handles:
Environment setup and maintenance
Cloud resource provisioning
Security policies and access controls
Cost optimization and monitoring
Data Science Team Focuses On:
Experiment design and execution
Model architecture and hyperparameters
Data preprocessing and feature engineering
Results analysis and iteration
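As one concrete example of this division, hyperparameters can live entirely in the data scientist's code. The sketch below uses the open-source valohai-utils helper library; the step name, image, and parameter names are illustrative assumptions.

```python
# A sketch of hyperparameters staying in data-science territory, using the
# open-source valohai-utils helper library.
import valohai

# prepare() declares the step and its defaults; `vh yaml step` can generate
# the matching valohai.yaml entry, which the platform then executes on an
# environment the ops team has configured.
valohai.prepare(
    step="train-model",
    image="python:3.10",
    default_parameters={
        "learning_rate": 0.001,
        "epochs": 10,
    },
)

# At run time, Valohai injects any overridden values; the script just reads them.
learning_rate = valohai.parameters("learning_rate").value
epochs = valohai.parameters("epochs").value

for epoch in range(epochs):
    # ... training loop ...
    pass
```

The experiment definition stays versioned alongside the code, while the environment it runs on remains the ops team's choice.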
The Outcome
This separation delivers concrete benefits:
Faster onboarding: New team members run experiments on day one
Better security: Cloud credentials stay with the ops team
Higher productivity: Data scientists spend their time on ML problems, not infrastructure
Controlled costs: Centralized environment management prevents resource sprawl
Implementation in Practice
Here's how a data scientist runs an experiment:
```bash
vh execution run --environment production-gpu
```

Behind that simple command, Valohai handles:
Provisioning the right compute instance
Mounting data stores with proper credentials
Injecting secrets and configuration
Setting up monitoring and logging
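The monitoring piece, for instance, needs no setup in user code: Valohai collects execution metadata from JSON objects printed to standard output, one per line, so a plain print statement is enough. The metric names below are illustrative.

```python
# A sketch of the logging side the platform wires up for you. Valohai parses
# JSON lines printed to stdout as execution metadata.
import json

for epoch in range(3):
    loss = 1.0 / (epoch + 1)  # placeholder metric
    # Each JSON line becomes a time-series point in the execution's metrics view.
    print(json.dumps({"epoch": epoch, "loss": loss}))
```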
The data scientist sees none of this complexity. They get results, not infrastructure headaches.
When to Use This Pattern
This approach works best when:
Your team has dedicated infrastructure/platform engineers
Security and compliance matter
You want to scale beyond a handful of researchers
Cloud costs need active management