Let Data Scientists Be Scientists
Data scientists shouldn't wrestle with cloud permissions, network configurations, or storage mounts. That's infrastructure work, not science.
Valohai inverts the traditional model: your ops team owns the environments, your data scientists own the experiments. Whether you run on AWS, Azure, GCP, or on-premises hardware, Valohai abstracts the infrastructure details away.
The Problem with Traditional Platforms
Most ML platforms force data scientists to become part-time DevOps engineers. These platforms demand cloud credentials, network configuration, and security-policy setup before anyone can run a single experiment.
This approach breaks down because:
Data scientists waste time on infrastructure instead of model improvement
Security risks multiply when everyone needs cloud access
Onboarding takes weeks instead of hours
How Valohai Works Differently
Zero Cloud Credentials for Data Scientists
Your data science team never touches:
Cloud provider CLIs or authentication tokens
Virtual networks, subnets, or firewall rules
Identity management or permission policies
Storage bucket configurations or access keys
Instead, they select pre-configured environments and run experiments.
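To make that concrete, here is a minimal sketch of what a training script looks like under this model. It assumes an input named `dataset` has been declared for the step and that pandas is available in the step's Docker image; the `/valohai/inputs` and `/valohai/outputs` paths are the conventional mount points Valohai provides inside an execution.

```python
# train.py -- a minimal sketch of a training script with no cloud code.
# Valohai mounts declared inputs under /valohai/inputs/<input-name>/ and
# uploads anything written to /valohai/outputs/ when the execution ends.
# The input name "dataset" and the CSV filename are illustrative assumptions.
from pathlib import Path

import pandas as pd  # assumed to be present in the step's Docker image

INPUTS = Path("/valohai/inputs")
OUTPUTS = Path("/valohai/outputs")

def main():
    # Read training data from the path Valohai mounted; no bucket URLs,
    # no access keys -- the ops-configured data store handled that.
    data = pd.read_csv(INPUTS / "dataset" / "train.csv")

    # ... train a model here ...

    # Anything written to /valohai/outputs is uploaded to the project's
    # data store using credentials the data scientist never sees.
    (OUTPUTS / "model.txt").write_text(f"trained on {len(data)} rows")

if __name__ == "__main__":
    main()
```

Note what is absent: no boto3, no connection strings, no secrets in environment variables.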
Clear Separation of Concerns
Infrastructure Team Handles:
Environment setup and maintenance
Cloud resource provisioning
Security policies and access controls
Cost optimization and monitoring
Data Science Team Focuses On:
Experiment design and execution
Model architecture and hyperparameters
Data preprocessing and feature engineering
Results analysis and iteration
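As one concrete example of this division, hyperparameters can live entirely in the data scientist's code. The sketch below uses the open-source valohai-utils helper library; the step name, image, and parameter names are illustrative assumptions.

```python
# A sketch of hyperparameters staying in data-science territory, using the
# open-source valohai-utils helper library.
import valohai

# prepare() declares the step and its defaults; `vh yaml step` can generate
# the matching valohai.yaml entry, which the platform then executes on an
# environment the ops team has configured.
valohai.prepare(
    step="train-model",
    image="python:3.10",
    default_parameters={
        "learning_rate": 0.001,
        "epochs": 10,
    },
)

# At run time, Valohai injects any overridden values; the script just reads them.
learning_rate = valohai.parameters("learning_rate").value
epochs = valohai.parameters("epochs").value

for epoch in range(epochs):
    # ... training loop ...
    pass
```

The experiment definition stays versioned alongside the code, while the environment it runs on remains the ops team's choice.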
The Outcome
This separation delivers concrete benefits:
Faster onboarding: New team members run experiments on day one
Better security: Cloud credentials stay with the ops team
Higher productivity: Data scientists spend their time on ML problems, not infrastructure
Controlled costs: Centralized environment management prevents resource sprawl
Implementation in Practice
Here's how a data scientist runs an experiment:
```bash
vh execution run --environment production-gpu
```

Behind that simple command, Valohai handles:
Provisioning the right compute instance
Mounting data stores with proper credentials
Injecting secrets and configuration
Setting up monitoring and logging
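The monitoring piece, for instance, needs no setup in user code: Valohai collects execution metadata from JSON objects printed to standard output, one per line, so a plain print statement is enough. The metric names below are illustrative.

```python
# A sketch of the logging side the platform wires up for you. Valohai parses
# JSON lines printed to stdout as execution metadata.
import json

for epoch in range(3):
    loss = 1.0 / (epoch + 1)  # placeholder metric
    # Each JSON line becomes a time-series point in the execution's metrics view.
    print(json.dumps({"epoch": epoch, "loss": loss}))
```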
The data scientist sees none of this complexity. They get results, not infrastructure headaches.
When to Use This Pattern
This approach works best when:
Your team has dedicated infrastructure/platform engineers
Security and compliance matter
You want to scale beyond a handful of researchers
Cloud costs need active management