# Self-Hosted on EKS

This guide walks through setting up a self-hosted installation of Valohai in your AWS EKS cluster.

A self-hosted installation runs all Valohai components inside your own network. Instead of app.valohai.com, users access a Valohai instance that you host.

Updates to the platform are delivered through Docker images.

## Prerequisites

**Existing infrastructure:**

* Existing Kubernetes cluster (AWS EKS)
* [AWS CLI installed](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html)
* AWS profile configured to access the EKS cluster from CLI
* At least two subnets in the VPC for the load balancer

**Tools:**

* [kubectl installed](https://kubernetes.io/docs/tasks/tools/)
* [Helm installed](https://helm.sh/)

**Optional:**

* [Karpenter installed](https://karpenter.sh/) for autoscaling (or another autoscaling solution)

**From Valohai:**

Contact **<support@valohai.com>** to receive Docker images and configuration details before proceeding.
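
Before continuing, it can help to confirm that the command-line tools used in this guide are on your `PATH`. The following is a minimal sketch; the `missing_tools` helper and the exact tool list are assumptions based on the commands that appear later in this guide, not something shipped by Valohai.

```shell
# Print the name of every given CLI tool that is not found on PATH.
# The helper name and the tool list below are illustrative assumptions.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

missing="$(missing_tools aws kubectl helm eksctl git)"
if [ -n "$missing" ]; then
  # $missing is intentionally unquoted so the names print on one line.
  echo "Missing tools:" $missing
else
  echo "All required tools found."
fi
```

Once the tools are present, `aws eks update-kubeconfig --region <AWS-region> --name <name-of-your-cluster>` is the usual way to point `kubectl` at the cluster.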

## Clone the Repository

Clone the public GitHub repository to get the YAML files required for installation:

```shell
git clone https://github.com/valohai/valohai-self-hosted-k8.git
cd valohai-self-hosted-k8
```

## Configure Settings

### Database Configuration

Edit `db-config-configmap.yaml` and provide:

```yaml
POSTGRES_PASSWORD: "<uppercase-lowercase-letters-numbers>"
```

### Optimo Configuration

Edit `optimo-deployment.yaml` and provide:

```yaml
env:
  - name: OPTIMO_BASIC_AUTH_PASSWORD
    value: "<uppercase-lowercase-letters-numbers>"
```

### Application Configuration

Edit `roi-config-configmap.yaml` and provide the following values:

**Required:**

* `PASSWORD` in `DATABASE_URL` - Must match `POSTGRES_PASSWORD` from `db-config-configmap.yaml`
* `SECRET_KEY` - Generate a random string (uppercase, lowercase, numbers)
* `REPO_PRIVATE_KEY_SECRET` - Generate a random string (uppercase, lowercase, numbers)
* `STATS_JWT_KEY` - Generate a random string (uppercase, lowercase, numbers)
* `OPTIMO_BASIC_AUTH_PASSWORD` - Must match the password from `optimo-deployment.yaml`
* `URL_BASE` - The address users will use to access your Valohai installation (e.g., <https://mycompany.valohai.app.com>)
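
One way to produce the random strings listed above is sketched here. The `gen_secret` helper and the default length of 50 characters are assumptions for illustration; the guide only requires uppercase letters, lowercase letters, and numbers.

```shell
# Generate a random alphanumeric string of the given length.
# The default length of 50 is an assumption, not a Valohai requirement.
gen_secret() {
  length="${1:-50}"
  # Keep only A-Z, a-z, 0-9 from the random stream and stop after $length characters.
  LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom 2>/dev/null | head -c "$length"
  echo
}

# Print one fresh value per key, ready to paste into roi-config-configmap.yaml.
for key in SECRET_KEY REPO_PRIVATE_KEY_SECRET STATS_JWT_KEY; do
  printf '%s: "%s"\n' "$key" "$(gen_secret)"
done
```

The same helper can be used for `POSTGRES_PASSWORD` and `OPTIMO_BASIC_AUTH_PASSWORD`, as long as each value is copied to both files that reference it.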

**Optional:**

For additional setup options (login methods, email server), discuss with your Valohai contact.

## Prepare the Valohai Docker Image

Your Valohai contact will provide the Valohai Docker image.

After downloading the image, push it to a container registry that your cluster can pull from.

Update the `image` in `valohai-deployment.yaml` to point to your registry and image.
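
If the registry is Amazon ECR, the tag-and-push step could look roughly like the sketch below. This assumes ECR as the registry and an image that is already available to your local Docker daemon; the helper names, repository name, and tags are hypothetical.

```shell
# Build an Amazon ECR image reference from account ID, region, repository, and tag.
ecr_image_ref() {
  account_id="$1"; region="$2"; repository="$3"; tag="$4"
  printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$account_id" "$region" "$repository" "$tag"
}

# Tag a locally available image with its ECR reference and push it.
# Assumes `docker login` against the registry has already succeeded
# and that the ECR repository exists.
push_to_ecr() {
  source_image="$1"; shift
  target="$(ecr_image_ref "$@")"
  docker tag "$source_image" "$target" && docker push "$target"
}
```

For example, after `aws ecr get-login-password --region <AWS-region> | docker login --username AWS --password-stdin <your-account-id>.dkr.ecr.<AWS-region>.amazonaws.com`, a call such as `push_to_ecr <local-image> <your-account-id> <AWS-region> <repository> <tag>` pushes the image, and the printed reference is the value to use for `image`.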

> **Note:** You will also have separate pods for database (`postgres`), job queue (`redis`), and Bayesian optimization service (`optimo`). These images are publicly available, so no changes are needed to those YAML files.

## Set Up the Namespace

By default, resources will be deployed to the `default` namespace. If you want to use another namespace, add it to all YAML files before applying them.
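
When switching namespaces, it is easy to miss a file. The sketch below lists YAML files that never mention the chosen namespace. The `files_missing_namespace` helper is hypothetical, and it is only a text search: a multi-document file counts as covered as soon as one resource in it sets the namespace, and cluster-scoped resources legitimately have none.

```shell
# List YAML files in a directory that contain no "namespace: <name>" line.
# Relies on `grep -L`, which is available in GNU and BSD grep.
files_missing_namespace() {
  dir="$1"; namespace="$2"
  grep -L -E "^[[:space:]]*namespace:[[:space:]]*\"?${namespace}\"?[[:space:]]*$" "$dir"/*.yaml
}
```

For example, `files_missing_namespace . <namespace>` run in the repository directory prints the files that still need a review. The namespace itself must exist before you apply the files; `kubectl create namespace <namespace>` creates it.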

## Set Up the Node Pool

If you use Karpenter, apply `nodepool.yaml` to create the node pool that Karpenter uses to scale nodes:

```shell
kubectl apply -f nodepool.yaml
```

You can modify the configuration based on your needs. For running the Valohai application and related components, we recommend a node with at least 4 CPUs and 16 GB RAM (e.g., AWS m5.xlarge).

## Define Subnets in Ingress

Edit `ingress.yaml` and provide the IDs of at least two subnets for `alb.ingress.kubernetes.io/subnets`.

These subnets are used for the load balancer that will be set up for accessing the Valohai web UI.
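
The annotation takes the subnet IDs as a comma-separated list. The following sketch builds that value and rejects fewer than two IDs; the `subnets_annotation_value` helper and the subnet IDs are hypothetical.

```shell
# Join subnet IDs into the comma-separated form used by the
# alb.ingress.kubernetes.io/subnets annotation.
# Fails if fewer than two IDs are given.
subnets_annotation_value() {
  if [ "$#" -lt 2 ]; then
    echo "error: at least two subnet IDs are required" >&2
    return 1
  fi
  value="$1"; shift
  for subnet in "$@"; do
    value="${value},${subnet}"
  done
  printf '%s\n' "$value"
}

# Example with placeholder IDs; replace them with subnets from your VPC.
subnets_annotation_value subnet-0aaa1111 subnet-0bbb2222
```

To list candidate subnets, `aws ec2 describe-subnets --filters "Name=vpc-id,Values=<your-vpc-id>" --query 'Subnets[].[SubnetId,AvailabilityZone]' --output text` prints each subnet ID with its Availability Zone. An Application Load Balancer needs subnets in at least two different Availability Zones.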

## Deploy Valohai

Apply all YAML files in the repository directory to deploy Valohai:

```shell
kubectl apply -f .
```

Verify that deployments, services, and pods are available for `valohai`, `postgres`, `redis`, and `optimo`:

```shell
kubectl get pods -n <namespace>
kubectl get deployments -n <namespace>
kubectl get services -n <namespace>
```
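
Instead of polling manually, you can wait for each deployment to finish rolling out. This is a sketch only: the `wait_for_deployments` helper is hypothetical, the 300-second timeout is arbitrary, and the deployment names passed to it must match the names shown by `kubectl get deployments`.

```shell
# Wait until every named deployment in the namespace has rolled out.
# Stops at the first deployment that fails or times out.
wait_for_deployments() {
  namespace="$1"; shift
  for name in "$@"; do
    kubectl rollout status "deployment/${name}" -n "$namespace" --timeout=300s || return 1
  done
}
```

For example, `wait_for_deployments <namespace> <deployment-name> [<deployment-name> ...]` returns a non-zero exit code if any of the listed deployments does not become ready in time.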

## Set Up the Load Balancer

Before adding the load balancer, set up the IAM policy and an IAM service account.

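> **Note:** The instructions use `eksctl` to create the service account, so `eksctl` must be installed. IAM roles for service accounts also require an IAM OIDC provider to be associated with the cluster. For more information, refer to [AWS documentation](https://docs.aws.amazon.com/eks/latest/userguide/lbc-helm.html).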

### Create IAM Policy

```shell
curl -O https://raw.githubusercontent.com/kubernetes-sigs/aws-load-balancer-controller/v2.11.0/docs/install/iam_policy.json

aws iam create-policy \
    --policy-name AWSLoadBalancerControllerIAMPolicy \
    --policy-document file://iam_policy.json
```

### Create IAM Service Account

```shell
eksctl create iamserviceaccount \
  --cluster=<name-of-your-cluster> \
  --namespace=kube-system \
  --name=aws-load-balancer-controller \
  --role-name AmazonEKSLoadBalancerControllerRole \
  --attach-policy-arn=arn:aws:iam::<your-account-id>:policy/AWSLoadBalancerControllerIAMPolicy \
  --approve
```
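
The `<your-account-id>` placeholder above can be looked up from the active AWS profile. The sketch below assumes the standard `aws` partition and the policy name used in the previous step; both helper names are hypothetical.

```shell
# Build the ARN of the load balancer controller policy for a given account ID.
# Assumes the standard "aws" partition (not aws-cn or aws-us-gov).
lbc_policy_arn() {
  printf 'arn:aws:iam::%s:policy/AWSLoadBalancerControllerIAMPolicy\n' "$1"
}

# Look up the account ID of the active AWS profile and print the policy ARN.
current_lbc_policy_arn() {
  account_id="$(aws sts get-caller-identity --query Account --output text)" || return 1
  lbc_policy_arn "$account_id"
}
```

Running `current_lbc_policy_arn` prints a value that can be passed to `--attach-policy-arn`.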

### Install Load Balancer Controller with Helm

Add the EKS chart repository:

```shell
helm repo add eks https://aws.github.io/eks-charts
# Refresh the local chart index so the latest chart version is used
helm repo update eks
```

Install the `aws-load-balancer-controller`. Replace `<name-of-your-cluster>`, `<AWS-region>`, and `<your-vpc-id>` with your values:

```shell
helm install aws-load-balancer-controller eks/aws-load-balancer-controller \
  --set 'clusterName=<name-of-your-cluster>' \
  --set serviceAccount.create=false \
  --set 'serviceAccount.name=aws-load-balancer-controller' \
  --set 'region=<AWS-region>' \
  --set 'vpcId=<your-vpc-id>' \
  -n kube-system
```

Verify that two `aws-load-balancer-controller-<id>` pods are running in the `kube-system` namespace:

```shell
kubectl get pods -n kube-system
```

Once the pods are running, get the ingress address from the namespace where you deployed Valohai:

```shell
kubectl get ingress -n <namespace>
```

This address is used to access your Valohai installation.

## Set Up Workers

Workers are needed to run your machine learning workloads. You have several options:

**Kubernetes Workers**

* Follow the [Kubernetes workers installation guide](https://docs.valohai.com/installation-and-setup/kubernetes/workers)

**On-premises Servers**

* Follow the [on-premises installation guide](https://docs.valohai.com/installation-and-setup/on-premises)

**Autoscaled EC2 Instances**

* Follow the [AWS hybrid deployment guide](https://docs.valohai.com/installation-and-setup/aws/hybrid)

> **Important:** Workers must be able to reach the Redis queue that this installation sets up in your cluster. Redis listens on port 6379.
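
A quick reachability check from a worker machine might look like the sketch below. How Redis is exposed to workers depends on your network setup, so the host you pass in is an assumption you need to supply. The `can_connect` helper is hypothetical and relies on bash's `/dev/tcp` redirection and the coreutils `timeout` command, which may not be available on every system.

```shell
# Return success if a TCP connection to host:port opens within 5 seconds.
# Requires bash (for /dev/tcp) and coreutils `timeout`.
can_connect() {
  timeout 5 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$1" "$2" 2>/dev/null
}
```

For example, `can_connect <redis-host> 6379 && echo reachable || echo unreachable` reports whether the worker can open a connection to the queue. It checks TCP reachability only, not Redis authentication.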

## Set Up Data Store

Valohai requires an S3-compatible data store. Options include:

* [MinIO](https://min.io/docs/minio/kubernetes/upstream/index.html) running on the cluster
* [S3 bucket in AWS](https://docs.valohai.com)

Discuss with your Valohai contact which option best fits your needs.

## Next Steps

Contact <support@valohai.com> to complete the configuration and verify the installation.
