Team Quotas

Team quotas control which teams can access specific machine types and how many machines they can run simultaneously. This lets you allocate expensive GPU resources to production teams while giving development teams access to cheaper instances.

Set Team Quotas

  1. Click Hi, <username> in the top-right corner

  2. Select Manage <organization>

  3. Open the Environments tab

  4. Click Manage team quotas

  5. Select an environment from the dropdown

  6. Select a team from the dropdown

  7. Set the quota value (number of concurrent machines)

  8. Click Add

The quota applies immediately. The team as a whole can now run up to the specified number of machines in parallel on that environment.

How Quotas Work

Quota = Maximum concurrent machines a team can use on a specific environment.

Example: The "data-science" team has a quota of 3 on aws-gpu-v100. They can run 3 executions simultaneously on V100 machines. A 4th execution queues until one of the running executions finishes and frees a machine.

No quota set: Team has no access to that environment (same as quota = 0).

Quota = 0: Explicitly blocks team from using that environment.
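
The rules above can be sketched as a small check. This is only an illustrative model of the behavior described here, not Valohai's implementation or API; the function name `can_start` and its parameters are hypothetical.

```python
from typing import Optional


def can_start(running: int, quota: Optional[int]) -> bool:
    """Return True if the team may start one more machine on an environment.

    running: machines the team currently has running on the environment.
    quota:   the team's quota there, or None when no quota has been set.
    """
    # Per the rules above, a missing quota behaves the same as a quota of 0.
    effective_quota = 0 if quota is None else quota
    return running < effective_quota


# The "data-science" example: a quota of 3 on aws-gpu-v100.
for running in range(5):
    state = "starts" if can_start(running, 3) else "queues"
    print(f"{running} running -> next execution {state}")
```

With 0, 1, or 2 machines running, the next execution starts; from 3 onward it queues.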

Common Quota Patterns

Separate Production and Development

Production team:

aws-cpu-production: Quota = 5
aws-gpu-a100-production: Quota = 2
aws-spot-instances: Quota = 0 (blocked)

Development team:

aws-cpu-dev: Quota = 10
aws-gpu-v100-dev: Quota = 5
aws-spot-instances: Quota = 20

Production gets stable machines only, while development also gets spot instances for cost savings.

Budget-Based Allocation

High-priority team:

aws-p4d-24xlarge: Quota = 4 ($128/hour max spend)

Experimental team:

aws-p4d-24xlarge: Quota = 1 ($32/hour max spend)

Allocate expensive resources proportionally to team priorities.
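
The spend ceilings in this pattern are simple arithmetic: quota multiplied by the machine's hourly price. The sketch below assumes a rate of $32/hour, which is the figure implied by the example above rather than a quoted cloud price; both function names are hypothetical.

```python
def max_hourly_spend(quota: int, hourly_rate: float) -> float:
    """Upper bound on hourly cost if the team runs at its full quota."""
    return quota * hourly_rate


def quota_for_budget(hourly_budget: float, hourly_rate: float) -> int:
    """Largest quota whose worst-case hourly cost stays within the budget."""
    return int(hourly_budget // hourly_rate)


# Assumed rate in USD/hour, taken from the example figures above.
P4D_HOURLY_RATE = 32.0

print(max_hourly_spend(4, P4D_HOURLY_RATE))    # 128.0 (high-priority team)
print(max_hourly_spend(1, P4D_HOURLY_RATE))    # 32.0 (experimental team)
print(quota_for_budget(100, P4D_HOURLY_RATE))  # 3
```

Working backwards from a budget, as in the last line, can be a convenient way to pick a quota.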

Fair Resource Sharing

Teams A, B, and C:

aws-gpu-v100: Quota = 3 (each team)

The organization has 9 V100 machines in total, and each team gets an equal share of 3.

Specialized Environments

ML Research team:

aws-gpu-a100: Quota = 8 (for large language models)
aws-cpu-small: Quota = 2 (limited CPU access)

Data Engineering team:

aws-cpu-large: Quota = 10 (for data processing)
aws-gpu-a100: Quota = 0 (no GPU access)

Each team gets capacity on the machine types its workloads need.

Block Team Access

Set quota to 0 to prevent a team from using an environment:

  1. Go to Manage team quotas

  2. Select environment and team

  3. Set quota to 0

  4. Click Add

Users in that team will see "No available environments" or the environment won't appear in their dropdown when they try to launch executions.

Use cases:

  • Restrict expensive machines to specific teams, such as senior engineers

  • Prevent staging teams from accessing production environments

  • Block teams from deprecated infrastructure

Quota vs Environment Max Scale

Team quotas and environment max scale work together:

Environment Max Scale: Total machines Valohai can launch across all teams.

Team Quotas: Per-team limits within that total.

Example:

Environment: aws-gpu-v100
Max Scale: 10 (organization-wide limit)

Team A Quota: 4
Team B Quota: 4
Team C Quota: 4

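Even though the quotas sum to 12, the environment max scale of 10 caps the total number of concurrent machines. No team can exceed its own quota of 4, and the three teams together can never run more than 10 machines, so not every team can use its full quota at the same time.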
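
To make the interaction concrete, here is a minimal sketch that applies both limits to the example above. It models only the two rules stated in this section; how queued executions are prioritized between teams is not modeled, and the function name `can_launch` is hypothetical.

```python
from typing import Dict


def can_launch(team: str, running: Dict[str, int],
               quotas: Dict[str, int], max_scale: int) -> bool:
    """Model whether `team` can start one more machine on an environment."""
    # Limit 1: the team's own quota (a missing quota is treated as 0).
    under_team_quota = running.get(team, 0) < quotas.get(team, 0)
    # Limit 2: the environment-wide max scale shared by all teams.
    under_max_scale = sum(running.values()) < max_scale
    return under_team_quota and under_max_scale


quotas = {"Team A": 4, "Team B": 4, "Team C": 4}  # sums to 12
max_scale = 10

# Teams A and B are at quota; Team C has 2 running, so 10 machines are in use.
running = {"Team A": 4, "Team B": 4, "Team C": 2}

print(can_launch("Team A", running, quotas, max_scale))  # False: team quota reached
print(can_launch("Team C", running, quotas, max_scale))  # False: max scale reached
```

Team C is blocked even though it has used only 2 of its 4 slots, because the environment as a whole is already at max scale.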

Monitor Quota Usage

Track how teams use their quotas:

Productivity Dashboard: See which teams consume the most resources over time.

Current executions: Check the executions list to see which teams are currently running jobs.

Hardware statistics: Review utilization to identify if quotas are too high or too low.

See Productivity Dashboard and Track Underutilization for usage analysis.

Adjust Quotas Over Time

Review and adjust quotas based on usage patterns:

Increase quotas when:

  • Teams consistently queue due to quota limits

  • Projects scale up and need more parallelism

  • Budgets increase

Decrease quotas when:

  • Teams underutilize their allocation

  • Costs need reduction

  • Other teams need more capacity

Quarterly review: Check quota usage every quarter and rebalance across teams.

Per-User vs Team Quotas

Team quotas: Limit concurrent machines for the entire team.

Per-user quotas: Limit concurrent machines for each individual user (set in environment settings).

Use both for fine-grained control:

Environment: aws-gpu-a100
Max Scale: 10
Per-User Quota: 2

Team ML-Research: Quota = 6
Team ML-Production: Quota = 4

The ML-Research team can use 6 machines total, but no individual user can monopolize more than 2 machines. This prevents one person from consuming all team capacity.

See Configure Environments for per-user quota setup.
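
The combined effect of the two quota types in the example above can be sketched as follows. This is an illustrative model only: it leaves out the environment max scale for brevity, and the function name `may_start` is hypothetical.

```python
import math


def may_start(user_running: int, team_running: int,
              per_user_quota: int, team_quota: int) -> bool:
    """A user can start a machine only if both the user and the team have room."""
    return user_running < per_user_quota and team_running < team_quota


PER_USER_QUOTA = 2
ML_RESEARCH_QUOTA = 6

# A user who already runs 2 machines is blocked, even with team capacity left.
print(may_start(2, 2, PER_USER_QUOTA, ML_RESEARCH_QUOTA))  # False
# A teammate with nothing running can still start.
print(may_start(0, 2, PER_USER_QUOTA, ML_RESEARCH_QUOTA))  # True
# Once the team runs 6 machines, nobody on the team can start more.
print(may_start(0, 6, PER_USER_QUOTA, ML_RESEARCH_QUOTA))  # False

# Minimum number of users needed to fill the team quota.
print(math.ceil(ML_RESEARCH_QUOTA / PER_USER_QUOTA))  # 3
```

With these numbers, at least three ML-Research users must be running jobs before the team quota becomes the binding limit.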

Troubleshooting

User Can't Select Environment

Cause: The team has no quota set for that environment, or its quota is 0.

Fix: Add a quota for the team:

  1. Go to Manage team quotas

  2. Select environment and team

  3. Set the quota to a value greater than 0

  4. Click Add

Execution Stuck in Queue

Possible causes:

  1. Team quota exhausted (check current team executions)

  2. Environment max scale reached (check organization-wide executions)

  3. Cloud provider quota exceeded (check cloud console)

Fix: Wait for running executions to complete, or increase quotas if appropriate.
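
As a rough aid for working through the first two causes, the sketch below takes counts you read off the executions list by hand and reports which limit is binding. It does not query Valohai or any cloud provider, and the function name `queue_reasons` is hypothetical.

```python
from typing import List


def queue_reasons(team_running: int, team_quota: int,
                  env_running: int, max_scale: int) -> List[str]:
    """List which of the two Valohai-side limits explain a queued execution."""
    reasons = []
    if team_running >= team_quota:
        reasons.append("team quota exhausted")
    if env_running >= max_scale:
        reasons.append("environment max scale reached")
    if not reasons:
        # Neither limit is hit, so the cause lies elsewhere.
        reasons.append("no quota limit reached; check the cloud provider quota")
    return reasons


# Team runs 3 of its 4 slots, but the environment is full at 10 of 10.
print(queue_reasons(team_running=3, team_quota=4, env_running=10, max_scale=10))
# ['environment max scale reached']
```

If both limits are reached, both reasons are returned, which suggests that raising the team quota alone would not unblock the queue.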

Team Hitting Quota Too Often

Cause: Quota too low for team's workload.

Fix: Analyze usage patterns and increase quota:

  1. Check how often team hits quota limits

  2. Increase the quota in proportion to how often and how long executions queue

  3. Monitor for 1-2 weeks and adjust again if needed

Example Quota Configuration

Small Organization (2 teams, 15 people)

Environment: aws-cpu-medium (Max Scale: 10)
- Team dev-team: Quota = 6
- Team ml-team: Quota = 4

Environment: aws-gpu-v100 (Max Scale: 4)
- Team dev-team: Quota = 2
- Team ml-team: Quota = 2

A roughly equal distribution, with slight priority to the development team for CPU.

Large Organization (5 teams, 50 people)

Environment: aws-spot-cpu (Max Scale: 50)
- Team research: Quota = 20
- Team engineering: Quota = 15
- Team analytics: Quota = 10
- Team ops: Quota = 5
- Team interns: Quota = 5

Environment: aws-gpu-a100 (Max Scale: 10)
- Team research: Quota = 5
- Team engineering: Quota = 3
- Team analytics: Quota = 0 (blocked)
- Team ops: Quota = 2
- Team interns: Quota = 0 (blocked)

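Tiered access with priority based on team needs. Analytics and interns are blocked from expensive GPUs. The aws-spot-cpu quotas sum to 55, above the max scale of 50, so the max scale caps total usage when every team is busy.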
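
A configuration like the one above can be sanity-checked before you enter it. The sketch below sums the team quotas per environment, compares the sum with the max scale, and lists blocked teams. The dictionary layout and the function name `summarize` are illustrative assumptions, not a Valohai configuration format.

```python
from typing import Dict


def summarize(max_scale: int, quotas: Dict[str, int]) -> Dict[str, object]:
    """Summarize one environment's team quotas against its max scale."""
    quota_sum = sum(quotas.values())
    return {
        "quota_sum": quota_sum,
        # How far the quotas exceed what the environment can actually run.
        "oversubscribed_by": max(0, quota_sum - max_scale),
        "blocked_teams": [team for team, q in quotas.items() if q == 0],
    }


# Values copied from the large-organization example above.
spot_cpu = {"research": 20, "engineering": 15, "analytics": 10, "ops": 5, "interns": 5}
gpu_a100 = {"research": 5, "engineering": 3, "analytics": 0, "ops": 2, "interns": 0}

print(summarize(50, spot_cpu))
# {'quota_sum': 55, 'oversubscribed_by': 5, 'blocked_teams': []}
print(summarize(10, gpu_a100))
# {'quota_sum': 10, 'oversubscribed_by': 0, 'blocked_teams': ['analytics', 'interns']}
```

Oversubscription is not an error: it only means the max scale, rather than the team quotas, becomes the limit when all teams are at capacity.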
