Self-Hosted on EKS

Deploy a fully self-hosted Valohai installation in your AWS EKS cluster

This guide walks through setting up a self-hosted installation of Valohai in your AWS EKS cluster.

A self-hosted installation runs all Valohai components inside your own network. Instead of app.valohai.com, users access a Valohai instance that you host.

Updates to the platform are delivered through Docker images.

Prerequisites

Existing infrastructure:

  • Existing Kubernetes cluster (AWS EKS)

  • AWS profile configured to access the EKS cluster from CLI

  • At least two subnets in the VPC for the load balancer
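If kubectl does not yet point at the cluster, one common way to set it up is the AWS CLI's update-kubeconfig command. The following is a minimal sketch: configure_kubectl is a hypothetical helper name, and the cluster name, region, and profile are placeholders you supply.

```shell
# Write or refresh the kubeconfig entry for an EKS cluster.
# $1: cluster name, $2: AWS region, $3: optional AWS profile (all placeholders).
configure_kubectl() {
  if [ -n "${3:-}" ]; then
    aws eks update-kubeconfig --name "$1" --region "$2" --profile "$3"
  else
    aws eks update-kubeconfig --name "$1" --region "$2"
  fi
}

# Usage (hypothetical values):
#   configure_kubectl <name-of-your-cluster> <AWS-region> <aws-profile>
#   kubectl get nodes   # lists the cluster's nodes if access works
```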

Tools (used by the commands in this guide):

  • git

  • kubectl

  • AWS CLI

  • eksctl

  • Helm

Optional:

From Valohai:

Contact Valohai support to receive Docker images and configuration details before proceeding.

Clone the Repository

Clone the public GitHub repository to get the YAML files required for installation:

git clone https://github.com/valohai/valohai-self-hosted-k8.git
cd valohai-self-hosted-k8

Configure Settings

Database Configuration

Edit db-config-configmap.yaml and provide:

POSTGRES_PASSWORD: "<uppercase-lowercase-letters-numbers>"

Optimo Configuration

Edit optimo-deployment.yaml and provide:

env:
  - name: OPTIMO_BASIC_AUTH_PASSWORD
    value: "<uppercase-lowercase-letters-numbers>"

Application Configuration

Edit roi-config-configmap.yaml and provide the following values:

Required:

  • PASSWORD in DATABASE_URL - Must match POSTGRES_PASSWORD from db-config-configmap.yaml

  • SECRET_KEY - Generate random string (uppercase, lowercase, numbers)

  • REPO_PRIVATE_KEY_SECRET - Generate random string (uppercase, lowercase, numbers)

  • STATS_JWT_KEY - Generate random string (uppercase, lowercase, numbers)

  • OPTIMO_BASIC_AUTH_PASSWORD - Must match the password from optimo-deployment.yaml

  • URL_BASE - The address users will use to access your Valohai installation (e.g., https://mycompany.valohai.app.com)
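The random strings for SECRET_KEY, REPO_PRIVATE_KEY_SECRET, and STATS_JWT_KEY can be generated in many ways. The sketch below uses only /dev/urandom, tr, and head. gen_secret is a hypothetical helper name, and the default length of 50 is an arbitrary choice, not a Valohai requirement.

```shell
# Print a random string of uppercase letters, lowercase letters, and digits.
# $1: desired length (defaults to 50, an arbitrary choice).
gen_secret() {
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c "${1:-50}"
}

# Print one fresh value per key; copy them into roi-config-configmap.yaml.
for key in SECRET_KEY REPO_PRIVATE_KEY_SECRET STATS_JWT_KEY; do
  echo "$key: \"$(gen_secret)\""
done
```

The same helper can produce the database and Optimo passwords. Remember that each of those must be entered identically in both files that reference it.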

Optional:

For additional setup options (login methods, email server), discuss with your Valohai contact.

Prepare the Valohai Docker Image

Your Valohai contact will provide the Valohai Docker image.

After downloading the image, push it to a container registry that your cluster can pull from.
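Re-tagging and pushing the image could look like the sketch below. push_to_registry is a hypothetical helper, and both image references are placeholders. The actual image name comes from your Valohai contact.

```shell
# Re-tag a local image for your own registry and push it there.
# $1: local image reference as received from Valohai (placeholder)
# $2: target reference in your registry (placeholder)
push_to_registry() {
  docker tag "$1" "$2" && docker push "$2"
}

# Usage (placeholders, not real image names):
#   push_to_registry <image-from-valohai>:<tag> <your-registry>/<repository>:<tag>
```

If the target registry is Amazon ECR, you will likely need to authenticate Docker first, for example by piping the output of aws ecr get-login-password into docker login.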

Update the image in valohai-deployment.yaml to point to your registry and image.

Note: You will also have separate pods for database (postgres), job queue (redis), and Bayesian optimization service (optimo). These images are publicly available, so no changes are needed to those YAML files.

Set Up the Namespace

By default, resources will be deployed to the default namespace. If you want to use another namespace, add it to all YAML files before applying them.
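A non-default namespace has to exist before namespaced resources can be applied to it. A minimal sketch for creating it idempotently follows. ensure_namespace is a hypothetical helper, and the namespace name is a placeholder.

```shell
# Create the namespace only if it does not exist yet.
# $1: namespace name (placeholder).
ensure_namespace() {
  kubectl get namespace "$1" >/dev/null 2>&1 || kubectl create namespace "$1"
}

# Usage:
#   ensure_namespace <namespace>
```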

Set Up the Node Pool

Apply nodepool.yaml to create a node pool that Karpenter uses to provision and scale nodes (this assumes Karpenter is already installed in the cluster):

kubectl apply -f nodepool.yaml

You can modify the configuration based on your needs. For running the Valohai application and related components, we recommend a node with at least 4 CPUs and 16 GB RAM (e.g., AWS m5.xlarge).

Define Subnets in Ingress

Edit ingress.yaml and provide the IDs of at least two subnets for alb.ingress.kubernetes.io/subnets.

These subnets are used for the load balancer that will be set up for accessing the Valohai web UI.
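To look up candidate subnet IDs, the AWS CLI can list the subnets of a VPC. This is a sketch under the assumption that you know the VPC ID. list_vpc_subnets is a hypothetical helper name.

```shell
# List subnet ID, availability zone, and public-IP setting for a VPC.
# $1: VPC ID (placeholder).
list_vpc_subnets() {
  aws ec2 describe-subnets \
    --filters "Name=vpc-id,Values=$1" \
    --query 'Subnets[].[SubnetId,AvailabilityZone,MapPublicIpOnLaunch]' \
    --output table
}

# Usage:
#   list_vpc_subnets <your-vpc-id>
```

The annotation takes the IDs as a comma-separated list. For an Application Load Balancer, the chosen subnets should be in different Availability Zones.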

Deploy Valohai

Deploy the Valohai setup:

kubectl apply -f .

Verify that deployments, services, and pods are available for valohai, postgres, redis, and optimo:

kubectl get pods -n <namespace>
kubectl get deployments -n <namespace>
kubectl get services -n <namespace>
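Instead of polling these commands by hand, kubectl wait can block until every deployment in the namespace reports the Available condition. A minimal sketch; wait_for_deployments is a hypothetical helper and the 300-second default timeout is an arbitrary choice.

```shell
# Block until all deployments in a namespace are Available, or time out.
# $1: namespace, $2: optional timeout (defaults to 300s, an arbitrary choice).
wait_for_deployments() {
  kubectl wait --for=condition=Available deployment --all \
    --namespace "$1" --timeout="${2:-300s}"
}

# Usage:
#   wait_for_deployments <namespace>
```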

Set Up the Load Balancer

Before adding the load balancer, set up the IAM policy and an IAM service account.

Note: The instructions use eksctl for creating the service account. For more information, refer to AWS documentation.

Create IAM Policy

curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.11.0/docs/install/iam_policy.json

aws iam create-policy \
    --policy-name AWSLoadBalancerControllerIAMPolicy \
    --policy-document file://iam_policy.json

Create IAM Service Account

eksctl create iamserviceaccount \
  --cluster=<name-of-your-cluster> \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --role-name AmazonEKSLoadBalancerControllerRole \
  --attach-policy-arn=arn:aws:iam::<your-account-id>:policy/AWSLoadBalancerControllerIAMPolicy \
  --approve
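The <your-account-id> placeholder in the policy ARN can be looked up with the AWS CLI. The sketch below builds the full ARN from the current credentials. lb_policy_arn is a hypothetical helper, and it assumes the policy name used in the create-policy step.

```shell
# Print the ARN of the load balancer controller policy for the current account.
# $1: optional policy name (defaults to the name used in the create-policy step).
lb_policy_arn() {
  account_id="$(aws sts get-caller-identity --query Account --output text)"
  printf 'arn:aws:iam::%s:policy/%s\n' \
    "$account_id" "${1:-AWSLoadBalancerControllerIAMPolicy}"
}

# Usage:
#   lb_policy_arn   # pass the result to --attach-policy-arn
```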

Install Load Balancer Controller with Helm

Add the EKS chart repository and refresh the local chart index:

helm repo add eks https://aws.github.io/eks-charts
helm repo update

Install the aws-load-balancer-controller. Replace <name-of-your-cluster>, <AWS-region>, and <your-vpc-id> with your values:

helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --set 'clusterName=<name-of-your-cluster>' \
  --set serviceAccount.create=false \
  --set 'serviceAccount.name=aws-load-balancer-controller' \
  --set 'region=<AWS-region>' \
  --set 'vpcId=<your-vpc-id>' \
  -n kube-system

Verify that two aws-load-balancer-controller-<id> pods are running in the kube-system namespace:

kubectl get pods -n kube-system

Once the pods are running, get the ingress address:

kubectl get ingress -n <namespace>

This address is used to access your Valohai installation.

Set Up Workers

Workers are needed to run your machine learning workloads. You have several options:

  • Kubernetes Workers

  • On-premises Servers

  • Autoscaled EC2 Instances

Important: Workers must be able to reach the Redis queue (port 6379) that this installation sets up in your cluster.
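A quick reachability check from a worker machine might look like the sketch below. It assumes netcat (nc) is installed on the worker. check_redis_port is a hypothetical helper, and the Redis host is a placeholder that depends on how you expose the queue.

```shell
# Check whether a TCP port is reachable, using a 5-second timeout.
# $1: host (placeholder), $2: port (defaults to 6379, the Redis queue port).
check_redis_port() {
  host="$1"
  port="${2:-6379}"
  if nc -z -w 5 "$host" "$port"; then
    echo "reachable: $host:$port"
  else
    echo "NOT reachable: $host:$port" >&2
    return 1
  fi
}

# Usage:
#   check_redis_port <redis-host>
```

This only confirms that the TCP port is open. It does not verify Redis authentication or queue configuration.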

Set Up Data Store

Valohai requires an S3-compatible data store. Options include:

Discuss with your Valohai contact which option best fits your needs.

Next Steps

Contact Valohai support to complete the configuration and verify the installation.
