Environment Splitting

Best practices for separating development and production environments in Valohai

Separating development and production environments is no longer just a best practice; it is a necessity for machine learning and AI development.

There are many reasons to split your environments, from ensuring the integrity of production data to maintaining access control and resource segregation.

Why Separate Dev and Prod?

Maintain Separate Environments

Separating development and production environments is crucial for several reasons, including cost tracking, access control, and resource isolation.

Benefits:

Safeguard your production environment from unintended changes during development
Enable tracking of development costs versus production expenses
Prevent accidental data mixing between environments

Access Control and Data Segregation

Access control is a top priority in machine learning projects. Separating dev and prod environments helps prevent unauthorized access and data breaches.

What it ensures:

Production pipelines are not accidentally launched with development data and vice versa
Only authorized team members can promote data, models, and code to production
Only authorized users can launch and schedule production pipelines
Development and production environments remain isolated with no visibility into each other's data or results
Essential data integrity and security are maintained
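
As a rough illustration of the first guarantee, a pre-launch check could reject inputs that live in another environment's storage. This is a minimal sketch, not a Valohai API: the `ALLOWED_BUCKETS` mapping and the `find_foreign_inputs` helper are hypothetical, and the bucket names are borrowed from the AWS example later on this page.

```python
from urllib.parse import urlparse

# Hypothetical mapping from environment name to the buckets it may read.
# The bucket names mirror the AWS example later in this guide.
ALLOWED_BUCKETS = {
    "dev": {"valohai-dev-artifacts"},
    "prod": {"valohai-prod-artifacts"},
}


def find_foreign_inputs(environment: str, input_uris: list[str]) -> list[str]:
    """Return the input URIs whose bucket is not approved for this environment."""
    allowed = ALLOWED_BUCKETS[environment]
    # For "s3://bucket/key", urlparse exposes the bucket name as netloc.
    return [uri for uri in input_uris if urlparse(uri).netloc not in allowed]


if __name__ == "__main__":
    inputs = [
        "s3://valohai-prod-artifacts/datasets/train.csv",
        "s3://valohai-dev-artifacts/scratch/sample.csv",
    ]
    foreign = find_foreign_inputs("prod", inputs)
    if foreign:
        print("Refusing to launch; inputs from another environment:", foreign)
```

A check like this only catches mistakes. Real isolation still depends on the storage and network permissions described below.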

Different Resources for Different Needs

Each environment, whether for development or production, may have distinct resource requirements, including machine types, networking configurations, and access rights.

Production environment examples:

Require workloads to use approved Docker images with thorough vulnerability scanning
Implement a separate virtual network with access to production data, inaccessible from development environments
Use different instance types (e.g., larger, more reliable instances for production)
Apply stricter security policies and compliance requirements

These measures ensure the robustness and security of your production pipeline.

What to Consider?

Before separating your development and production environments in Valohai, review this checklist. Some sections may not apply to your use case, but reviewing each point is still recommended.

Accounts and Resource Groups

Cloud accounts:

Create separate accounts, subscriptions, or resource groups for development and production
Restrict access to approved base Docker images
Ensure accurate and segregated cost tracking

Kubernetes clusters:

Separate deployments by environment, using either distinct namespaces within a shared cluster or entirely separate clusters
Consider separate node pools for dev and prod workloads

User Access

Access management:

Define a distinct set of users for different environments
Implement access controls to enforce these restrictions
Use role-based access control (RBAC) to manage permissions

Authentication:

Consider separate identity providers for production
Implement stricter authentication requirements for production (e.g., MFA)

Data Storage

Storage strategy:

Consider separate data stores for different stages
Define some resources as read-only in one environment and read-write in another

Data promotion:

Decide whether to promote data and models between stages or to generate new datasets and models in each environment

Valohai datasets:

Determine if you need to share Valohai datasets and aliases between environments
Consider using different storage buckets for each environment

Version Control

Branch strategy:

Decide which branches to pull from in different stages
Limit production projects to pulling from the main branch only
Use feature branches for development environments

Code promotion:

Implement Git branch protection rules
Ensure code changes are reviewed and approved before merging to main
Consider using tags or releases for production deployments
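
The branch strategy above can be expressed as a small policy check that runs before a job is launched. The sketch below is an assumption for illustration: `BRANCH_POLICY` and `is_ref_allowed` are hypothetical names, and the `develop` and `feature/*` patterns are example conventions rather than anything Valohai enforces by itself.

```python
import fnmatch

# Hypothetical policy: glob patterns of Git refs each environment may run.
BRANCH_POLICY = {
    "dev": ["develop", "feature/*"],
    "prod": ["main"],
}


def is_ref_allowed(environment: str, ref: str) -> bool:
    """Return True if the Git ref matches any pattern allowed for the environment."""
    patterns = BRANCH_POLICY[environment]
    # fnmatchcase is case-sensitive on every platform, like Git branch names.
    return any(fnmatch.fnmatchcase(ref, pattern) for pattern in patterns)


if __name__ == "__main__":
    for env, ref in [("prod", "main"), ("prod", "feature/new-model"), ("dev", "feature/new-model")]:
        print(env, ref, "allowed" if is_ref_allowed(env, ref) else "blocked")
```

Such a check complements Git branch protection rules; it does not replace them.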

API Key Management

Key separation:

Manage API keys separately for production and development
Consider implementing key rotation policies in different environments
Use different service accounts for each environment

Security:

Store production API keys in secure vaults
Limit production key access to authorized personnel only
Audit API key usage regularly
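
One way to keep keys separate in scripts is to resolve the key from an environment-specific variable and fail loudly when it is missing, rather than falling back to another environment's key. This is a sketch under assumptions: the `EXAMPLE_API_KEY_DEV` and `EXAMPLE_API_KEY_PROD` variable names and the `load_api_key` helper are hypothetical, and in practice a secrets vault would typically inject these values.

```python
import os
from typing import Mapping

KNOWN_ENVIRONMENTS = ("dev", "prod")


def load_api_key(environment: str, variables: Mapping[str, str] = os.environ) -> str:
    """Return the API key for one environment, never falling back to another."""
    if environment not in KNOWN_ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    # Hypothetical naming scheme: one variable per environment.
    name = f"EXAMPLE_API_KEY_{environment.upper()}"
    key = variables.get(name, "")
    if not key:
        raise KeyError(f"{name} is not set")
    return key


if __name__ == "__main__":
    demo_variables = {"EXAMPLE_API_KEY_DEV": "dev-secret"}
    print(load_api_key("dev", demo_variables))
```

Passing the variables as a parameter keeps the function easy to test without touching the real process environment.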

Quota Management

Resource limits:

Evaluate quota management if your environments reside in the same account
Set quotas to prevent development from consuming production resources
Monitor resource usage across environments

Cost management:

Implement budgets and alerts for each environment
Track spending separately for dev and prod
Consider auto-shutdown policies for development environments
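
Per-environment budgets and alerts can be reasoned about with a simple threshold check. The following sketch assumes spend figures come from your cloud provider's billing export; the `Budget` class, the 80% alert threshold, and the amounts are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    monthly_limit: float           # same currency as the spend figures
    alert_threshold: float = 0.8   # assumed: alert at 80% of the limit


def budget_status(budget: Budget, spent: float) -> str:
    """Classify month-to-date spend as 'ok', 'alert', or 'over_budget'."""
    if spent >= budget.monthly_limit:
        return "over_budget"
    if spent >= budget.monthly_limit * budget.alert_threshold:
        return "alert"
    return "ok"


if __name__ == "__main__":
    # Hypothetical limits; each environment is tracked on its own.
    budgets = {"dev": Budget(monthly_limit=1000.0), "prod": Budget(monthly_limit=5000.0)}
    spend = {"dev": 850.0, "prod": 1200.0}
    for env, budget in budgets.items():
        print(env, budget_status(budget, spend[env]))
```

Evaluating each environment against its own budget keeps a development overrun from being hidden inside a combined total.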

Implementation Patterns in Valohai

Pattern 1: Separate Cloud Accounts

Setup:

Development: Separate AWS/Azure/GCP account
Production: Separate AWS/Azure/GCP account

Pros:

Complete isolation
Clear cost separation
Different billing
Independent quota management

Cons:

More accounts to manage
Potential duplication of resources
Cross-account data transfer may incur costs

Best for: Organizations with strict compliance requirements or large teams

Pattern 2: Separate Resource Groups

Primarily intended for Azure users.

Setup:

Development: Resource group in shared account
Production: Resource group in shared account

Pros:

Easier management than separate accounts
Cost tracking per resource group
Shared networking possible if needed

Cons:

Requires careful IAM configuration
Shared quotas need management
Less isolation than separate accounts

Best for: Medium-sized teams with moderate security requirements

Pattern 3: Separate Kubernetes Namespaces

Primarily intended for Kubernetes users.

Setup:

Development: Namespace in shared cluster
Production: Namespace in shared cluster (or separate cluster)

Pros:

Simple to set up
Resource quotas per namespace
Network policies for isolation

Cons:

Shared cluster resources
Potential for misconfiguration
Less isolation than separate clusters

Best for: Teams already using Kubernetes extensively

Pattern 4: Separate Valohai Organizations

Setup:

Development: Valohai organization
Production: Valohai organization

Pros:

Complete logical separation in Valohai including available data stores, datasets, registries, and environments
Independent user management
Clear project boundaries

Cons:

Requires coordination between organizations
Potential duplication of projects and configurations
Separate billing if using Valohai's managed service

Best for: Large enterprises with multiple teams

Example: AWS Implementation

Here's an example of how to implement environment splitting in AWS:

Development Environment

Account/Resources:

AWS Account: dev-ml-account
VPC: valohai-dev-vpc
S3 Bucket: valohai-dev-artifacts
IAM Roles: ValohaiWorkerRole-dev

Configuration:

Instance types: Smaller, cost-effective instances
Spot instances: Enabled for cost savings
Auto-shutdown: After 2 hours of inactivity
Git branch: develop or feature branches

Production Environment

Account/Resources:

AWS Account: prod-ml-account
VPC: valohai-prod-vpc
S3 Bucket: valohai-prod-artifacts
IAM Roles: ValohaiWorkerRole-prod

Configuration:

Instance types: Reliable on-demand instances
Spot instances: Disabled or limited
Auto-shutdown: Disabled
Git branch: main only
Additional: VPC peering to production databases
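
The two environments above can also be captured as data, which makes it possible to verify that they do not accidentally share a resource. This is a minimal sketch: the `EnvironmentSpec` class and `shared_resources` helper are hypothetical, the values are copied from the example lists, and "Disabled or limited" spot usage in production is simplified to a plain boolean.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EnvironmentSpec:
    aws_account: str
    vpc: str
    s3_bucket: str
    iam_role: str
    spot_instances: bool
    idle_shutdown_hours: Optional[float]  # None means auto-shutdown is disabled
    git_branches: tuple[str, ...]


DEV = EnvironmentSpec(
    aws_account="dev-ml-account",
    vpc="valohai-dev-vpc",
    s3_bucket="valohai-dev-artifacts",
    iam_role="ValohaiWorkerRole-dev",
    spot_instances=True,
    idle_shutdown_hours=2.0,
    git_branches=("develop", "feature/*"),
)

PROD = EnvironmentSpec(
    aws_account="prod-ml-account",
    vpc="valohai-prod-vpc",
    s3_bucket="valohai-prod-artifacts",
    iam_role="ValohaiWorkerRole-prod",
    spot_instances=False,  # simplification of "Disabled or limited"
    idle_shutdown_hours=None,
    git_branches=("main",),
)


def shared_resources(a: EnvironmentSpec, b: EnvironmentSpec) -> set[str]:
    """Return the names of identity fields that hold the same value in both specs."""
    identity_fields = ("aws_account", "vpc", "s3_bucket", "iam_role")
    return {name for name in identity_fields if getattr(a, name) == getattr(b, name)}


if __name__ == "__main__":
    overlap = shared_resources(DEV, PROD)
    print("Shared resources:", overlap if overlap else "none")
```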

Data Flow

Development to Production:

Data scientists develop and test in the dev environment
Code is committed to a feature branch
A pull request is created to merge into main
Code review and approval are required
After the merge, the production pipeline can be triggered
Production uses separate, validated datasets
Models are evaluated in the production environment
Approved models are deployed to production inference
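
The gates in this flow can be sketched as a checklist that must pass in full before a model reaches production inference. The `PromotionRequest` fields and the `blocking_reasons` helper below are hypothetical; in a real setup the flags would be populated from your Git host and your evaluation results.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromotionRequest:
    review_approved: bool      # pull request reviewed and approved
    merged_to_main: bool       # change merged into main
    dataset_validated: bool    # production dataset passed validation
    model_approved: bool       # model approved after production evaluation


def blocking_reasons(request: PromotionRequest) -> list[str]:
    """Return the unmet gates, in flow order; an empty list means deployable."""
    checks = [
        (request.review_approved, "code review not approved"),
        (request.merged_to_main, "change not merged to main"),
        (request.dataset_validated, "production dataset not validated"),
        (request.model_approved, "model not approved after production evaluation"),
    ]
    return [reason for passed, reason in checks if not passed]


if __name__ == "__main__":
    request = PromotionRequest(
        review_approved=True,
        merged_to_main=True,
        dataset_validated=False,
        model_approved=False,
    )
    reasons = blocking_reasons(request)
    print("Deployable" if not reasons else f"Blocked: {reasons}")
```

Returning every unmet gate, rather than stopping at the first one, gives the person requesting the promotion a complete picture in one pass.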

Getting Help

Valohai Support: [email protected]

For help with:

Architecting your environment separation
Configuring access controls
Setting up promotion workflows
Implementing best practices for your organization

Consultation:

Valohai can provide consultation on environment architecture for your specific needs, including compliance requirements, team structure, and budget considerations.


