# EKS for Real-Time Inference

Configure your Amazon EKS cluster to deploy Valohai models for real-time inference.

> **Note:** This guide is specifically for real-time inference deployments. It does not enable using Kubernetes for standard Valohai workers. For Kubernetes workers, see our [installation guide](/installation-and-setup/kubernetes/workers.md).

## Overview

Valohai can push deployments to an existing EKS cluster using standard Kubernetes APIs.

**Requirements:**

* app.valohai.com (`34.248.245.191`, `63.34.156.112`) must be able to access your cluster's API Server over HTTPS
* Deployment endpoints themselves do not need to be public: your cluster can be configured to serve only private deployment endpoints

## Prerequisites

**Existing infrastructure:**

* EKS cluster
* kubectl configured to access your cluster

**Tools:**

* kubectl installed
* AWS CLI installed and configured

## Step 1: Install Gateway API CRDs and Gateway controller

* **Install Gateway API CRDs** — Follow the [Getting Started guide](https://gateway-api.sigs.k8s.io/guides/getting-started/#install-standard-channel) on the official Gateway API documentation.
* **Install a Gateway controller** — Choose and install a controller from the [official implementations list](https://gateway-api.sigs.k8s.io/implementations/#gateway-controller-implementation-status).
* **Create a `Gateway`** — Configure a Gateway for Valohai Deployments to use. This will serve as the base URL / domain for all Valohai Deployment Endpoints, including TLS termination for `https://`. Refer to your chosen controller's documentation for the exact configuration. Note down the Gateway **name** and **namespace** for later.
* **Grant permissions to manage HTTP routes** — See the `httproutes` resource in Step 3.

Record the Gateway's URL scheme (`https://` or `http://`) alongside its **name** and **namespace**; you'll need all three in Step 6.
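As a rough illustration of the Gateway described above, the following sketch prints a minimal HTTPS Gateway manifest. The Gateway name, namespace, GatewayClass name, and TLS secret name are placeholders invented for this example, and `allowedRoutes.namespaces.from: All` is an assumption that HTTPRoutes may live in a different namespace than the Gateway. Your controller's documentation takes precedence over anything shown here.

```shell
# Print a minimal Gateway manifest to stdout.
# $1 = Gateway name, $2 = Gateway namespace,
# $3 = GatewayClass provided by your controller, $4 = Secret holding the TLS certificate.
# All four values are placeholders for this sketch, not Valohai requirements.
gateway_manifest() {
  cat <<EOF
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: $1
  namespace: $2
spec:
  gatewayClassName: $3
  listeners:
    - name: https
      protocol: HTTPS
      port: 443
      tls:
        mode: Terminate
        certificateRefs:
          - name: $4
      allowedRoutes:
        namespaces:
          from: All   # assumption: routes may be created in other namespaces
EOF
}

# Example usage (uncomment and adjust the placeholder values):
# gateway_manifest valohai-gateway default my-gateway-class valohai-tls | kubectl apply -f -
# kubectl get crd gateways.gateway.networking.k8s.io httproutes.gateway.networking.k8s.io
```

The second commented command checks that the Gateway API CRDs are installed before you apply the manifest.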

## Step 2: Create Kubernetes Service Account

Create a service account that Valohai will use to manage deployments.

```shell
kubectl create serviceaccount valohai-deployment
```

### Create Service Account Token

On Kubernetes 1.24 and later, token Secrets are no longer created automatically for service accounts, so create one manually:

```shell
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: valohai-deployment-token
  namespace: <NAMESPACE HERE>
  annotations:
    kubernetes.io/service-account.name: valohai-deployment
EOF
```

Replace `<NAMESPACE HERE>` with the namespace where you created the service account (`default` if you did not specify one).

### Get the Token

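Confirm that the service account exists, then retrieve and decode the token. Add `-n <namespace>` to both commands if the service account is not in your current namespace: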

```shell
kubectl get serviceaccounts valohai-deployment -o json
kubectl get secret valohai-deployment-token -o jsonpath='{.data.token}' | base64 --decode
```

Save this token value to provide to Valohai.
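Before sending the token, it can help to confirm that the decoded value is well-formed. Service account tokens are JSON Web Tokens, so the value should consist of three non-empty segments separated by dots. The helper below is a hypothetical sanity check on the shape only; it does not verify the signature or the token's permissions.

```shell
# Return 0 if the argument has the header.payload.signature shape of a JWT.
looks_like_jwt() {
  case "$1" in
    *.*.*.*) return 1 ;;   # more than three segments
    ?*.?*.?*) return 0 ;;  # exactly three non-empty segments
    *) return 1 ;;         # empty, or fewer than three segments
  esac
}

# Example usage:
# token="$(kubectl get secret valohai-deployment-token -o jsonpath='{.data.token}' | base64 --decode)"
# looks_like_jwt "$token" && echo "token looks well-formed"
```

An empty result usually means the Secret was created in a different namespace than the service account, or that the annotation does not match the service account name.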

## Step 3: Create Kubernetes Role

Create a role that defines the permissions Valohai needs.

Create a file `valohai-deployment-role.yml`:

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: valohai-deployment-role
rules:
  - apiGroups: [""]
    resources: ["events", "namespaces"]
    verbs: ["get", "list", "watch"]
  - apiGroups: [""]
    resources: ["pods", "pods/log", "services"]
    verbs: ["create", "delete", "deletecollection", "get", "list", "patch", "update", "watch"]
  - apiGroups: ["apps", "extensions"]
    resources: ["deployments", "deployments/rollback", "deployments/scale"]
    verbs: ["create", "delete", "deletecollection", "get", "list", "patch", "update", "watch"]
  - apiGroups: ["gateway.networking.k8s.io"]
    resources: ["httproutes"]
    verbs: ["create", "delete", "deletecollection", "get", "list", "patch", "update", "watch"]
```

A Role is always namespaced. Without a `namespace` field it is created in your current kubectl namespace; to target a specific namespace, add `namespace: <NAMESPACE>` under `metadata`.

Apply the role:

```shell
kubectl apply -f valohai-deployment-role.yml
```

## Step 4: Create Role Binding

Bind the role to the service account.

Replace `<namespace>` with the namespace of the service account (`default` if you did not specify one):

```shell
kubectl create rolebinding valohai-deployment-binding \
    --role=valohai-deployment-role \
    --serviceaccount=<namespace>:valohai-deployment
```
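To confirm that the binding took effect, you can impersonate the service account with `kubectl auth can-i`. The function below is a sketch that checks a handful of the permissions from the Role, not all of them; the function name is invented for this example, and it assumes your own kubectl user is allowed to impersonate service accounts (cluster administrators typically are).

```shell
# Check a sample of the Role's permissions by impersonating the service account.
# $1 = namespace of the service account (defaults to "default").
# Prints one line per check and returns non-zero if any check is denied.
check_valohai_rbac() {
  ns="${1:-default}"
  sa="system:serviceaccount:${ns}:valohai-deployment"
  failed=0
  for check in "create pods" "create services" "create deployments.apps" \
               "create httproutes.gateway.networking.k8s.io" "list events"; do
    # $check is intentionally unquoted so it splits into VERB and RESOURCE.
    if kubectl auth can-i $check --as="$sa" -n "$ns" >/dev/null 2>&1; then
      echo "ok: $check"
    else
      echo "denied: $check"
      failed=1
    fi
  done
  return "$failed"
}

# Example usage:
# check_valohai_rbac default
```

A `denied` line usually means the Role and RoleBinding were created in a different namespace than the service account.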

## Step 5: Configure Repository Access

Ensure your cluster's nodes can pull from the repository that Valohai pushes images to.

### AWS IAM User for ECR

Create an IAM user that Valohai can use to access the cluster and push to your ECR.

**1. Create IAM user**

Navigate to **IAM** in AWS Console and create a user named `valohai-eks-user`.

* Enable **Programmatic access** and **Console access**

**2. Attach policies**

Attach these existing policies:

* `AmazonEC2ContainerRegistryFullAccess`
* `AmazonEKSServicePolicy`

**3. Create custom policy**

Create a new policy named `VH_EKS_USER`:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "1",
      "Effect": "Allow",
      "Action": "eks:ListClusters",
      "Resource": "*"
    }
  ]
}
```

**4. Attach custom policy**

Go back to the user, refresh, and attach the `VH_EKS_USER` policy.

**5. Save credentials**

Store the access key and secret in a safe place. You'll provide these to Valohai.
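If you prefer the AWS CLI over the console, the steps above can be approximated as follows. This is a sketch under a few assumptions: the custom policy JSON shown above is saved as `VH_EKS_USER.json` in the current directory, your account is in the standard `aws` partition, and only programmatic access is set up (console access is not covered). The function name is invented for this example.

```shell
# Create the IAM user, attach the managed and custom policies, and create an access key.
# Assumes VH_EKS_USER.json (the custom policy shown above) exists in the current directory.
create_valohai_iam_user() {
  user="valohai-eks-user"
  aws iam create-user --user-name "$user" || return 1
  for policy in AmazonEC2ContainerRegistryFullAccess AmazonEKSServicePolicy; do
    aws iam attach-user-policy --user-name "$user" \
      --policy-arn "arn:aws:iam::aws:policy/${policy}" || return 1
  done
  # create-policy returns the ARN of the new customer-managed policy.
  custom_arn="$(aws iam create-policy --policy-name VH_EKS_USER \
    --policy-document file://VH_EKS_USER.json \
    --query 'Policy.Arn' --output text)" || return 1
  aws iam attach-user-policy --user-name "$user" --policy-arn "$custom_arn" || return 1
  # The secret access key is shown only once; store the output securely.
  aws iam create-access-key --user-name "$user"
}

# Example usage:
# create_valohai_iam_user
```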

### Alternative: Other Container Registries

You can use standard Docker login (username/password) credentials for:

* Azure Container Registry
* GitLab Container Registry
* Artifactory
* Docker Hub
* Other registries

Create a separate account for Valohai to push to your repository.

## Step 6: Collect Information

Gather the following information to send to Valohai:

**Cluster details:**

Find these on your cluster's page in EKS:

* Cluster name: `____________`
* AWS region: `____________`
* API server endpoint: `____________`
* Cluster ARN: `____________`
* Certificate authority (cluster-certificate-data): `____________`
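The cluster details listed above can also be read with the AWS CLI instead of the console. The function below is a small sketch that prints them as one JSON object; the function name and the key names in the output are chosen for this example.

```shell
# Print the cluster name, ARN, API server endpoint, and certificate authority data.
# $1 = cluster name, $2 = AWS region
cluster_details() {
  aws eks describe-cluster --name "$1" --region "$2" \
    --query 'cluster.{name:name,arn:arn,endpoint:endpoint,certificateAuthority:certificateAuthority.data}' \
    --output json
}

# Example usage:
# cluster_details my-cluster eu-west-1
```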

**Service account:**

* `valohai-deployment` service account token: `____________`

**Networking:**

* Gateway name: `____________`
* Gateway namespace: `____________`
* Base URL scheme (`https://` or `http://`): `____________`

**Container registry:**

ECR:

* ECR name/URL: `____________`
  * Example: `accountid.dkr.ecr.eu-west-1.amazonaws.com`
  * Find this when creating a new repository in ECR

IAM credentials:

* `valohai-eks-user` access key ID: `____________`
* `valohai-eks-user` secret access key: `____________`

Send the collected information to your Valohai contact at **<support@valohai.com>** using your organization's secure communication method.

Your Valohai contact will complete the configuration on the platform side.

## Next Steps

After Valohai confirms the setup:

**1. Test deployments**

* Create a deployment in Valohai
* Verify it deploys to your EKS cluster
* Check that the endpoint is accessible

**2. Configure access controls**

* Review security groups for the load balancer
* Configure authentication for endpoints if needed
* Set up network policies in Kubernetes

**3. Monitor deployments**

* Set up CloudWatch monitoring for your deployments
* Configure alerts for deployment health
* Review resource usage and costs

## Troubleshooting

### Cannot access cluster API

**Verify API endpoint:**

```shell
aws eks describe-cluster --name CLUSTER_NAME --query "cluster.endpoint"
```

**Check security group:**

Ensure the cluster security group allows access from Valohai IPs (`34.248.245.191/32`, `63.34.156.112/32`).
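If the cluster uses the public API endpoint, the allowed source ranges are also reported under `resourcesVpcConfig.publicAccessCidrs` in the `describe-cluster` output. The helper below is a simplified sketch that checks such a list for the Valohai IPs. It only recognizes exact `/32` entries and `0.0.0.0/0`, so a broader range that happens to contain the IPs is reported as missing. The function name is invented for this example.

```shell
# Return 0 if the given CIDR list contains 0.0.0.0/0 or both Valohai IPs as /32 entries.
# This is a plain string comparison, not real CIDR arithmetic.
cidrs_allow_valohai() {
  cidrs=" $* "
  case "$cidrs" in
    *" 0.0.0.0/0 "*) return 0 ;;
  esac
  for ip in 34.248.245.191 63.34.156.112; do
    case "$cidrs" in
      *" ${ip}/32 "*) ;;   # this IP is listed, check the next one
      *) return 1 ;;
    esac
  done
  return 0
}

# Example usage:
# cidrs_allow_valohai $(aws eks describe-cluster --name CLUSTER_NAME \
#   --query "cluster.resourcesVpcConfig.publicAccessCidrs" --output text) \
#   && echo "Valohai IPs are allowed"
```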

### Service account token invalid

**Regenerate token:**

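Delete and recreate the secret. Replace `default` with your namespace, and add `-n <namespace>` to the `delete` command, if the service account lives elsewhere: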

```shell
kubectl delete secret valohai-deployment-token
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
type: kubernetes.io/service-account-token
metadata:
  name: valohai-deployment-token
  namespace: default
  annotations:
    kubernetes.io/service-account.name: valohai-deployment
EOF
```

**Get new token:**

```shell
kubectl get secret valohai-deployment-token -o jsonpath='{.data.token}' | base64 --decode
```

### HTTPRoutes not working

**Check that HTTPRoutes were created:**

```shell
kubectl get httproutes -n <valohai-namespace>
```

**Check Gateway status:**

```shell
kubectl describe gateway <gateway-name> -n <gateway-namespace>
```

**Verify the Gateway controller is running** by checking the controller's pods according to your chosen controller's documentation.
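The Gateway API reports whether a route was attached to its parent Gateway through an `Accepted` condition in the route's status. The sketch below reads that condition for a single HTTPRoute; the function name is invented for this example, and it prints one value per parent Gateway.

```shell
# Print the status of the "Accepted" condition for each parent of an HTTPRoute.
# $1 = HTTPRoute name, $2 = namespace
httproute_accepted() {
  kubectl get httproute "$1" -n "$2" \
    -o jsonpath='{.status.parents[*].conditions[?(@.type=="Accepted")].status}'
}

# Example usage:
# httproute_accepted <route-name> <valohai-namespace>
```

A value of `False`, or no output at all, suggests that the Gateway does not allow routes from that namespace or that the controller has not processed the route yet.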

### Cannot push to ECR

**Test ECR access:**

```shell
aws ecr get-login-password --region REGION | docker login --username AWS --password-stdin ACCOUNT.dkr.ecr.REGION.amazonaws.com
```

**Verify IAM permissions:**

Ensure `valohai-eks-user` has `AmazonEC2ContainerRegistryFullAccess`.

### Deployments fail

**Check deployment logs:**

```shell
kubectl logs -l app=your-deployment -n default
```

**Check pod events:**

```shell
kubectl describe pod POD_NAME -n default
```

**Common issues:**

* Image pull errors (check ECR permissions)
* Resource limits too low
* Network policies blocking traffic
* Service ports misconfigured
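Most of the issues listed above surface as Warning events in the namespace. As a quick way to see them in one place, the sketch below lists Warning events sorted by time; the function name is invented for this example.

```shell
# List Warning events in a namespace, oldest first, so the latest problems appear last.
# $1 = namespace (defaults to "default")
recent_warnings() {
  kubectl get events -n "${1:-default}" \
    --field-selector type=Warning \
    --sort-by=.lastTimestamp
}

# Example usage:
# recent_warnings default
```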

## Getting Help

**Valohai Support:** <support@valohai.com>

**Include in support requests:**

* EKS cluster name and region
* Kubernetes version
* Service account token status
* kubectl version
* Error messages from deployments or Kubernetes events
* Gateway and HTTPRoute configurations

