# Kubernetes Autoscaling

Configure autoscaling for your Kubernetes cluster to dynamically provision resources for Valohai ML workloads.

## Overview

Valohai workers on Kubernetes are implemented as Kubernetes Jobs. With autoscaling configured, your cluster can automatically:

* Scale up nodes when Valohai jobs are queued
* Scale down nodes when jobs complete and resources are idle
* Select appropriate instance types based on job requirements
* Optimize costs by using spot/preemptible instances
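
For example, each queued execution shows up as a worker pod stuck in `Pending` until a suitable node exists; listing those pods is a quick way to see what the autoscaler is reacting to (this sketch assumes the `valohai-workers` namespace used elsewhere in this guide):

```bash
# Worker pods waiting for a node are what trigger scale-up
kubectl get pods -n valohai-workers --field-selector status.phase=Pending
```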

> **Note:** This guide uses AWS EKS with Karpenter as an example, but the concepts apply to any Kubernetes cluster. The same principles work with GKE, AKS, or on-premises Kubernetes using different autoscalers.

## Autoscaling Options

You can use various autoscaling solutions with Valohai:

### Karpenter (Recommended for AWS EKS)

**Best for:** AWS EKS clusters

**Advantages:**

* Fast node provisioning (seconds vs. minutes)
* Flexible instance selection
* Bin-packing optimization
* Direct EC2 API integration

**Cloud support:** AWS (native), Azure and GCP (experimental)

### Cluster Autoscaler

**Best for:** Multi-cloud environments, stable workloads

**Advantages:**

* Cloud-agnostic
* Mature and widely used
* Works with all major cloud providers
* Simple configuration

**Cloud support:** AWS, GCP, Azure, and others

### Cloud-Native Autoscalers

**GKE Autopilot:** Fully managed node provisioning on GKE

**AKS Cluster Autoscaler:** Azure's native autoscaling

**Best for:** Organizations standardized on one cloud provider

## Example: Karpenter on AWS EKS

This section provides a complete example of setting up Karpenter on AWS EKS. If you're using a different cloud provider or autoscaler, adapt these concepts to your environment.

### Requirements

**Existing infrastructure:**

* EKS cluster with Valohai workers installed
* [AWS CLI](https://aws.amazon.com/cli/) installed
* [kubectl](https://kubernetes.io/docs/reference/kubectl/) configured

**Permissions:**

* Admin access to your EKS cluster
* IAM permissions to create roles and policies

### Step 1: Set Up Environment Variables

First, confirm that your cluster has an IAM OIDC provider, then define the common variables used throughout the remaining steps:

```bash
# Check if OIDC is configured
aws iam list-open-id-connect-providers
# Should show: oidc.eks.<region>.amazonaws.com/id/<ID>

export AWS_PROFILE=<aws-profile>
export AWS_REGION=<region>
export KUBECONFIG=~/.kube/<cluster-name>

CLUSTER=<cluster-name>
KARPENTER_NAMESPACE=kube-system
AWS_PARTITION="aws"
OIDC_ENDPOINT="$(aws eks describe-cluster --name ${CLUSTER} --query "cluster.identity.oidc.issuer" --output text)"
AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query 'Account' --output text)
```
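
Before continuing, it's worth checking that the lookups succeeded; empty values here will silently produce broken policies in the next step:

```bash
# Sanity check: all three values should be non-empty
echo "Cluster: ${CLUSTER}"
echo "Account: ${AWS_ACCOUNT_ID}"
echo "OIDC:    ${OIDC_ENDPOINT}"
```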

### Step 2: Create IAM Roles

Create two IAM roles: one for nodes provisioned by Karpenter and one for the Karpenter controller.

**Create node trust policy:**

```bash
echo '{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Service": "ec2.amazonaws.com"
            },
            "Action": "sts:AssumeRole"
        }
    ]
}' > node-trust-policy.json
```

**Create node role:**

```bash
aws iam create-role \
  --role-name "KarpenterNodeRole-${CLUSTER}" \
  --assume-role-policy-document file://node-trust-policy.json

aws iam attach-role-policy \
  --role-name "KarpenterNodeRole-${CLUSTER}" \
  --policy-arn arn:${AWS_PARTITION}:iam::aws:policy/AmazonEKSWorkerNodePolicy

aws iam attach-role-policy \
  --role-name "KarpenterNodeRole-${CLUSTER}" \
  --policy-arn arn:${AWS_PARTITION}:iam::aws:policy/AmazonEKS_CNI_Policy

aws iam attach-role-policy \
  --role-name "KarpenterNodeRole-${CLUSTER}" \
  --policy-arn arn:${AWS_PARTITION}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly

aws iam attach-role-policy \
  --role-name "KarpenterNodeRole-${CLUSTER}" \
  --policy-arn arn:${AWS_PARTITION}:iam::aws:policy/AmazonSSMManagedInstanceCore
```
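
To verify that all four managed policies are attached:

```bash
# Should list the AmazonEKS*, AmazonEC2*, and AmazonSSM* policies attached above
aws iam list-attached-role-policies \
  --role-name "KarpenterNodeRole-${CLUSTER}" \
  --output table
```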

**Create controller trust policy:**

```bash
cat << EOF > controller-trust-policy.json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "Federated": "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_ENDPOINT#*//}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {
                    "${OIDC_ENDPOINT#*//}:aud": "sts.amazonaws.com",
                    "${OIDC_ENDPOINT#*//}:sub": "system:serviceaccount:${KARPENTER_NAMESPACE}:karpenter"
                }
            }
        }
    ]
}
EOF
```

**Create controller role:**

```bash
aws iam create-role \
  --role-name KarpenterControllerRole-${CLUSTER} \
  --assume-role-policy-document file://controller-trust-policy.json
```

**Create controller policy:**

```bash
cat << EOF > controller-policy.json
{
    "Statement": [
        {
            "Action": [
                "ssm:GetParameter",
                "ec2:DescribeImages",
                "ec2:RunInstances",
                "ec2:DescribeSubnets",
                "ec2:DescribeSecurityGroups",
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeInstances",
                "ec2:DescribeInstanceTypes",
                "ec2:DescribeInstanceTypeOfferings",
                "ec2:DescribeAvailabilityZones",
                "ec2:DeleteLaunchTemplate",
                "ec2:CreateTags",
                "ec2:CreateLaunchTemplate",
                "ec2:CreateFleet",
                "ec2:DescribeSpotPriceHistory",
                "pricing:GetProducts"
            ],
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "Karpenter"
        },
        {
            "Action": "ec2:TerminateInstances",
            "Condition": {
                "StringLike": {
                    "ec2:ResourceTag/karpenter.sh/nodepool": "*"
                }
            },
            "Effect": "Allow",
            "Resource": "*",
            "Sid": "ConditionalEC2Termination"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": "arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER}",
            "Sid": "PassNodeIAMRole"
        },
        {
            "Effect": "Allow",
            "Action": "eks:DescribeCluster",
            "Resource": "arn:${AWS_PARTITION}:eks:${AWS_REGION}:${AWS_ACCOUNT_ID}:cluster/${CLUSTER}",
            "Sid": "EKSClusterEndpointLookup"
        },
        {
            "Sid": "AllowScopedInstanceProfileCreationActions",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "iam:CreateInstanceProfile"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:RequestTag/kubernetes.io/cluster/${CLUSTER}": "owned",
                    "aws:RequestTag/topology.kubernetes.io/region": "${AWS_REGION}"
                },
                "StringLike": {
                    "aws:RequestTag/karpenter.k8s.aws/ec2nodeclass": "*"
                }
            }
        },
        {
            "Sid": "AllowScopedInstanceProfileTagActions",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "iam:TagInstanceProfile"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/kubernetes.io/cluster/${CLUSTER}": "owned",
                    "aws:ResourceTag/topology.kubernetes.io/region": "${AWS_REGION}",
                    "aws:RequestTag/kubernetes.io/cluster/${CLUSTER}": "owned",
                    "aws:RequestTag/topology.kubernetes.io/region": "${AWS_REGION}"
                },
                "StringLike": {
                    "aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass": "*",
                    "aws:RequestTag/karpenter.k8s.aws/ec2nodeclass": "*"
                }
            }
        },
        {
            "Sid": "AllowScopedInstanceProfileActions",
            "Effect": "Allow",
            "Resource": "*",
            "Action": [
                "iam:AddRoleToInstanceProfile",
                "iam:RemoveRoleFromInstanceProfile",
                "iam:DeleteInstanceProfile"
            ],
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/kubernetes.io/cluster/${CLUSTER}": "owned",
                    "aws:ResourceTag/topology.kubernetes.io/region": "${AWS_REGION}"
                },
                "StringLike": {
                    "aws:ResourceTag/karpenter.k8s.aws/ec2nodeclass": "*"
                }
            }
        },
        {
            "Sid": "AllowInstanceProfileReadActions",
            "Effect": "Allow",
            "Resource": "*",
            "Action": "iam:GetInstanceProfile"
        }
    ],
    "Version": "2012-10-17"
}
EOF
```

**Attach policy to role:**

```bash
aws iam put-role-policy \
  --role-name KarpenterControllerRole-${CLUSTER} \
  --policy-name KarpenterControllerPolicy-${CLUSTER} \
  --policy-document file://controller-policy.json
```
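
To confirm the controller role and its inline policy are in place:

```bash
# The role ARN is also used as the service account annotation in Step 5
aws iam get-role \
  --role-name "KarpenterControllerRole-${CLUSTER}" \
  --query 'Role.Arn' --output text

# Should list KarpenterControllerPolicy-<cluster-name>
aws iam list-role-policies \
  --role-name "KarpenterControllerRole-${CLUSTER}"
```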

### Step 3: Tag Resources

Tag your node group subnets and the cluster security group so Karpenter can discover which resources to use when launching nodes:

**Tag subnets:**

```bash
for NODEGROUP in $(aws eks list-nodegroups --cluster-name ${CLUSTER} \
    --query 'nodegroups' --output text); do
    aws ec2 create-tags \
        --tags "Key=karpenter.sh/discovery,Value=${CLUSTER}" \
        --resources $(aws eks describe-nodegroup --cluster-name ${CLUSTER} \
        --nodegroup-name $NODEGROUP --query 'nodegroup.subnets' --output text)
done
```

**Tag security group:**

```bash
NODEGROUP=$(aws eks list-nodegroups --cluster-name ${CLUSTER} --query 'nodegroups[0]' --output text)

SECURITY_GROUPS=$(aws eks describe-cluster \
  --name ${CLUSTER} \
  --query "cluster.resourcesVpcConfig.clusterSecurityGroupId" \
  --output text)

aws ec2 create-tags \
    --tags "Key=karpenter.sh/discovery,Value=${CLUSTER}" \
    --resources ${SECURITY_GROUPS}
```
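
To verify that the discovery tag landed on the right resources:

```bash
# Subnets and the cluster security group carrying the discovery tag
aws ec2 describe-subnets \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER}" \
  --query 'Subnets[].SubnetId' --output text

aws ec2 describe-security-groups \
  --filters "Name=tag:karpenter.sh/discovery,Values=${CLUSTER}" \
  --query 'SecurityGroups[].GroupId' --output text
```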

### Step 4: Update aws-auth ConfigMap

Allow nodes with the KarpenterNodeRole to join the cluster:

```bash
cat << EOF
    - groups:
      - system:bootstrappers
      - system:nodes
      rolearn: arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterNodeRole-${CLUSTER}
      username: system:node:{{EC2PrivateDNSName}}
EOF
```

Add the output under `mapRoles` in the aws-auth ConfigMap:

```bash
kubectl edit configmap aws-auth -n kube-system
```
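
After saving, confirm the new entry is present:

```bash
# The KarpenterNodeRole entry should appear under mapRoles
kubectl get configmap aws-auth -n kube-system -o yaml | \
  grep -B2 -A2 "KarpenterNodeRole-${CLUSTER}"
```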

### Step 5: Deploy Karpenter

**Set Karpenter version:**

```bash
export KARPENTER_VERSION=v0.33.1
```

**Generate Karpenter manifests:**

```bash
helm template karpenter oci://public.ecr.aws/karpenter/karpenter \
  --version "${KARPENTER_VERSION}" \
  --namespace "${KARPENTER_NAMESPACE}" \
  --set "settings.clusterName=${CLUSTER}" \
  --set "serviceAccount.annotations.eks\.amazonaws\.com/role-arn=arn:${AWS_PARTITION}:iam::${AWS_ACCOUNT_ID}:role/KarpenterControllerRole-${CLUSTER}" \
  --set controller.resources.requests.cpu=1 \
  --set controller.resources.requests.memory=1Gi \
  --set controller.resources.limits.cpu=1 \
  --set controller.resources.limits.memory=1Gi > karpenter.yaml
```

**Modify affinity rules:**

Edit the affinity section of `karpenter.yaml` so the Karpenter controller runs on your existing node group rather than on nodes it provisions itself. Because the file is applied as-is, replace `${NODEGROUP}` with the node group name from Step 3:

```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: karpenter.sh/nodepool
          operator: DoesNotExist
      - matchExpressions:
        - key: eks.amazonaws.com/nodegroup
          operator: In
          values:
          - ${NODEGROUP}
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
```

**Deploy Karpenter CRDs:**

```bash
kubectl create -f \
    https://raw.githubusercontent.com/aws/karpenter-provider-aws/${KARPENTER_VERSION}/pkg/apis/crds/karpenter.sh_nodepools.yaml
kubectl create -f \
    https://raw.githubusercontent.com/aws/karpenter-provider-aws/${KARPENTER_VERSION}/pkg/apis/crds/karpenter.k8s.aws_ec2nodeclasses.yaml
kubectl create -f \
    https://raw.githubusercontent.com/aws/karpenter-provider-aws/${KARPENTER_VERSION}/pkg/apis/crds/karpenter.sh_nodeclaims.yaml
```
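
Verify the CRDs are registered before deploying the controller:

```bash
# Expect nodepools.karpenter.sh, nodeclaims.karpenter.sh,
# and ec2nodeclasses.karpenter.k8s.aws
kubectl get crds | grep -E 'karpenter\.(sh|k8s\.aws)'
```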

**Deploy Karpenter:**

```bash
kubectl apply -f karpenter.yaml
```
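
Check that the controller comes up cleanly before creating node pools (the manifest generated above creates a `karpenter` Deployment):

```bash
# The karpenter Deployment and its pods should reach Ready
kubectl rollout status deployment/karpenter -n "${KARPENTER_NAMESPACE}"
kubectl get pods -n "${KARPENTER_NAMESPACE}" -l app.kubernetes.io/name=karpenter
```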

### Step 6: Create Node Pools

Create node pools for different workload types.

**CPU Node Pool:**

```bash
cat <<EOF | envsubst | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["c", "m", "r"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        name: default
  limits:
    cpu: 100
    memory: 1000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h
---
apiVersion: karpenter.k8s.aws/v1beta1
kind: EC2NodeClass
metadata:
  name: default
spec:
  amiFamily: AL2
  role: "KarpenterNodeRole-${CLUSTER}"
  subnetSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER}"
  securityGroupSelectorTerms:
    - tags:
        karpenter.sh/discovery: "${CLUSTER}"
EOF
```
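
Confirm both resources were accepted:

```bash
# Both should be listed; no nodes are launched until a job needs one
kubectl get nodepools
kubectl get ec2nodeclasses
```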

**GPU Node Pool (Optional):**

If using GPUs, install the NVIDIA device plugin first:

```bash
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update

# Check version and use it in the command below
helm search repo nvdp --devel

helm upgrade --install nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin \
  --create-namespace \
  --version <version>
```
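
Confirm the device plugin DaemonSet is running before scheduling GPU jobs:

```bash
# The plugin advertises nvidia.com/gpu capacity on GPU nodes
kubectl get daemonset -n nvidia-device-plugin
```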

Create GPU node pool:

```bash
cat <<EOF | envsubst | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: default-gpu
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["p"]
        - key: karpenter.k8s.aws/instance-generation
          operator: Gt
          values: ["2"]
      nodeClassRef:
        name: default
      taints:
      - key: nvidia.com/gpu
        value: "true"
        effect: "NoSchedule"
  limits:
    cpu: 100
    memory: 1000Gi
    nvidia.com/gpu: 5
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h
EOF
```
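
Once a GPU execution has triggered provisioning, you can confirm the new node advertises GPU capacity:

```bash
# Nodes provisioned from this pool carry the karpenter.sh/nodepool label
kubectl get nodes -l karpenter.sh/nodepool=default-gpu
kubectl describe node -l karpenter.sh/nodepool=default-gpu | grep -i 'nvidia.com/gpu'
```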

### Step 7: Monitor Scaling

Follow Karpenter logs to see scaling activity:

```bash
kubectl logs -f -n ${KARPENTER_NAMESPACE} -c controller -l app.kubernetes.io/name=karpenter
```

**Test scaling:**

Create a Valohai execution and watch Karpenter provision nodes automatically.
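
While the execution is queued, you can watch Karpenter react; it first creates a NodeClaim, then the corresponding instance registers as a node:

```bash
# In one terminal: watch provisioning attempts
kubectl get nodeclaims -w

# In another terminal: watch Karpenter-managed nodes join
kubectl get nodes -l karpenter.sh/nodepool --watch
```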

## Adapting to Other Environments

The concepts above apply to other Kubernetes environments. Here's how to adapt:

### Google Cloud (GKE)

**Use GKE Cluster Autoscaler:**

```bash
gcloud container clusters update CLUSTER_NAME \
  --enable-autoscaling \
  --min-nodes=1 \
  --max-nodes=10 \
  --node-pool=default-pool
```

**Or use GKE Autopilot** for fully managed node provisioning.

### Azure (AKS)

**Use AKS Cluster Autoscaler:**

```bash
az aks update \
  --resource-group RESOURCE_GROUP \
  --name CLUSTER_NAME \
  --enable-cluster-autoscaler \
  --min-count 1 \
  --max-count 10
```

### On-Premises or Custom Kubernetes

**Use Kubernetes Cluster Autoscaler:**

Install Cluster Autoscaler following [Kubernetes documentation](https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler).

Configure it to work with your infrastructure provider (vSphere, OpenStack, etc.).

## Best Practices

### Node Pool Configuration

**Separate pools for different workloads:**

* CPU-intensive: `c` instance family
* Memory-intensive: `r` instance family
* GPU workloads: `p` or `g` instance family
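
For example, a memory-optimized pool could restrict Karpenter to the `r` instance family while reusing the `default` EC2NodeClass from the EKS example above (the pool name and limits here are illustrative):

```bash
cat <<EOF | kubectl apply -f -
apiVersion: karpenter.sh/v1beta1
kind: NodePool
metadata:
  name: memory-optimized
spec:
  template:
    spec:
      requirements:
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
        - key: kubernetes.io/os
          operator: In
          values: ["linux"]
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: karpenter.k8s.aws/instance-category
          operator: In
          values: ["r"]
      nodeClassRef:
        name: default
  limits:
    cpu: 100
    memory: 2000Gi
  disruption:
    consolidationPolicy: WhenUnderutilized
    expireAfter: 720h
EOF
```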

**Cost optimization:**

* Use spot/preemptible instances for interruptible workloads
* Set appropriate limits to prevent runaway costs
* Configure consolidation for efficient resource usage

### Resource Requests

**Set accurate requests in Valohai:**

* CPU and memory requests help autoscaler make better decisions
* Over-requesting wastes resources
* Under-requesting causes scheduling failures
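
To see what a queued job actually requested, which is what the autoscaler acts on, inspect the pending pod (the pod name below is a placeholder):

```bash
# Requests and limits of the first container in the worker pod
kubectl get pod POD_NAME -n valohai-workers \
  -o jsonpath='{.spec.containers[0].resources}'
```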

### Scaling Parameters

**Balance speed and cost:**

* Fast scale-up for time-sensitive workloads
* Gradual scale-down to avoid thrashing
* Appropriate consolidation policies

## Troubleshooting

### Nodes not scaling up

**Check Karpenter logs:**

```bash
kubectl logs -n kube-system -l app.kubernetes.io/name=karpenter
```

**Common issues:**

* IAM permissions insufficient
* No matching node pool for job requirements
* Instance type not available in region
* Subnet or security group not tagged
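
NodeClaims are also worth inspecting; a claim that never becomes a node usually carries the failure reason in its status and events:

```bash
# Each scale-up attempt is recorded as a NodeClaim
kubectl get nodeclaims -o wide
kubectl describe nodeclaim NODECLAIM_NAME
```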

### Nodes not scaling down

**Check disruption settings:**

* Verify consolidation policy
* Check if nodes have workloads preventing disruption
* Review expiration settings
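
To see which workloads are keeping a node alive, list the pods scheduled on it (node name is a placeholder):

```bash
# Pods still running on the node block consolidation
kubectl get pods -A --field-selector spec.nodeName=NODE_NAME
```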

**Force disruption (use with caution):**

```bash
kubectl delete node NODE_NAME
```

### Jobs stuck pending

**Describe the pod:**

```bash
kubectl describe pod POD_NAME -n valohai-workers
```

**Check events:**

```bash
kubectl get events -n valohai-workers --sort-by='.lastTimestamp'
```

**Common issues:**

* Resource requests too large
* No node pool matches requirements
* Taints preventing scheduling

## Getting Help

**Valohai Support:** <support@valohai.com>

**Include in support requests:**

* Kubernetes version
* Autoscaler type and version
* Node pool configurations
* Pod descriptions and events
* Autoscaler logs

**For Karpenter-specific issues:**

* Karpenter logs
* NodePool and EC2NodeClass definitions
* AWS IAM role configuration
