Let Data Scientists Be Scientists

Data scientists shouldn't wrestle with cloud permissions, network configurations, or storage mounts. That's infrastructure work, not science.

Valohai inverts the traditional model: your ops team owns the environments, your data scientists own the experiments. Whether you run on AWS, Azure, GCP, or on-premises hardware, Valohai abstracts the underlying infrastructure away from the people running experiments.

The Problem with Traditional Platforms

Most ML platforms force data scientists to become part-time DevOps engineers, demanding cloud credentials, network configuration, and security-policy setup before anyone can run a single experiment.

This approach breaks down because:

  • Data scientists waste time on infrastructure instead of model improvement

  • Security risks multiply when everyone needs cloud access

  • Onboarding takes weeks instead of hours

How Valohai Works Differently

Zero Cloud Credentials for Data Scientists

Your data science team never touches:

  • Cloud provider CLIs or authentication tokens

  • Virtual networks, subnets, or firewall rules

  • Identity management or permission policies

  • Storage bucket configurations or access keys

Instead, they select pre-configured environments and run experiments.
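
In day-to-day use, the only credential a data scientist holds is their Valohai login itself; compute and storage are referenced by ops-assigned names. A minimal sketch of that authentication surface (the environment slug is illustrative):

vh login
# Authenticates against Valohai only: no cloud provider login, no AWS/Azure/GCP
# keys in the shell. From here on, cloud resources are referenced by name,
# e.g. --environment <slug>, picked from the list the ops team has enabled.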

Clear Separation of Concerns

Infrastructure Team Handles:

  • Environment setup and maintenance

  • Cloud resource provisioning

  • Security policies and access controls

  • Cost optimization and monitoring

Data Science Team Focuses On:

  • Experiment design and execution

  • Model architecture and hyperparameters

  • Data preprocessing and feature engineering

  • Results analysis and iteration
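
This boundary shows up directly in a project's valohai.yaml: the data science side declares what an experiment needs, while how those inputs are reached and authenticated stays with the ops-managed data stores. A minimal sketch; the step name, image, parameter, and input URL are all illustrative:

- step:
    name: train-model
    image: python:3.10
    command: python train.py {parameters}
    parameters:
      - name: learning_rate
        type: float
        default: 0.001
    inputs:
      - name: dataset
        # The bucket and its credentials are configured by the ops team as a
        # Valohai data store; nothing secret lives in this file.
        default: s3://ops-managed-bucket/training/data.csv

Note that nothing here names an instance type, subnet, or access key: the environment is chosen at run time, and everything else is resolved by the platform.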

The Outcome

This separation delivers concrete benefits:

  • Faster onboarding: New team members run experiments on day one

  • Better security: Cloud credentials stay with the ops team

  • Higher productivity: Data scientists spend their time on ML problems, not infrastructure tickets

  • Controlled costs: Centralized environment management prevents resource sprawl

Implementation in Practice

Here's how a data scientist runs an experiment (train-model is the hypothetical step from the valohai.yaml sketch above; production-gpu is an environment slug set up by the ops team):

vh execution run train-model --environment production-gpu

Behind that simple command, Valohai handles:

  • Provisioning the right compute instance

  • Mounting data stores with proper credentials

  • Injecting secrets and configuration

  • Setting up monitoring and logging
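
Inside the running container, those mounts surface as ordinary local files under Valohai's /valohai/inputs/<input-name>/ convention, so training code reads paths rather than buckets. Continuing the hypothetical dataset input from the sketch above (train.py and its --data flag are illustrative):

ls /valohai/inputs/dataset/
# data.csv  (fetched by Valohai using the ops team's store credentials)

python train.py --data /valohai/inputs/dataset/data.csv
# Anything the script writes to /valohai/outputs/ is uploaded and versioned
# automatically when the execution finishes.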

The data scientist sees none of this complexity. They get results, not infrastructure headaches.

When to Use This Pattern

This approach works best when:

  • Your team has dedicated infrastructure/platform engineers

  • Security and compliance matter

  • You want to scale beyond a handful of researchers

  • Cloud costs need active management
