# Hybrid Deployment

Deploy Valohai workers and storage to your Azure Resource Group while Valohai manages the application layer at app.valohai.com.

## What Gets Deployed

The Compute and Data Layer of Valohai can be deployed to your Azure Resource Group. This enables you to:

* Use your own Virtual Machine instances to run machine learning jobs
* Use your own Azure Blob Storage for storing training artifacts (trained models, preprocessed datasets, visualizations)
* Access databases and data warehouses directly from workers inside your network

Valohai doesn't have direct access to the virtual machine instances that execute machine learning jobs. Instead, it communicates with a static virtual machine or an Azure Managed Redis in your resource group that's responsible for storing the job queue, job states, and short-term logs.

## Prerequisites

**From Valohai:**

Contact **<support@valohai.com>** to receive:

* `valohai_assume_user` - ARN of the user Valohai will use to assume a role
* `queue_address` - DNS name assigned to your queue

**From your Azure account:**

* Admin access to Azure Portal
* Azure subscription
* Resource group (existing or new)
* Region selected

### Selecting Your Region

When selecting your region, consider:

* **GPU availability:** Regions have different collections of available GPU types
  * US: East US or West US 2 (widest array of GPU types in the United States)
  * EU: West Europe (widest array of GPU types in Europe)
  * Check the [Azure product availability page](https://azure.microsoft.com/en-us/explore/global-infrastructure/products-by-region/) for details
* **Data location:** Use the region where your data is located to reduce transfer times
* **GPU quota:** Use regions where you've already acquired GPU quota from Microsoft
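
The availability and quota checks above can also be done from a terminal. The following is a minimal sketch, assuming the Azure CLI (`az`) is installed and logged in to your subscription; the function names and the `Standard_NC` prefix are illustrative and not part of the Valohai setup.

```bash
# Sketch only: assumes the Azure CLI is installed and `az login` has been run.

# List VM sizes offered in a region whose names start with a given prefix.
list_gpu_sizes() {
  # $1 = region (e.g. westeurope), $2 = size-name prefix (e.g. Standard_NC)
  az vm list-skus --location "$1" --size "$2" --resource-type virtualMachines --output table
}

# Show current compute usage and quota limits for a region.
show_compute_quota() {
  # $1 = region
  az vm list-usage --location "$1" --output table
}
```

For example, `list_gpu_sizes westeurope Standard_NC` lists one GPU family in West Europe, and `show_compute_quota westeurope` shows how much of your quota is already in use there.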

## Create Resource Group

Navigate to **Resource Group Management** and select "Add".

**Configuration:**

* Subscription: Select your subscription
* Resource Group Name: `valohai` (or your preferred name)
* Region: Your selected region

Make note of the Resource Group name, as Valohai engineers will need this.
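
If you prefer the command line over the portal, the same resource group can be created with the Azure CLI. This is a sketch that assumes `az` is installed and logged in; the function name is hypothetical.

```bash
# Sketch only: CLI equivalent of the portal steps above.
create_resource_group() {
  # $1 = resource group name (e.g. valohai), $2 = region (e.g. westeurope)
  az group create --name "$1" --location "$2"
}
```

For example: `create_resource_group valohai westeurope`.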

## Create Virtual Network

Valohai needs a virtual network. You can provide an existing vNet and subnets or create a new one.

### Create New Virtual Network

**1. Navigate to your resource group:** Select **Add** and search for "Virtual Network".

**2. Configure the network**

* Name: `valohai-vnet` (or your preferred name)
* Region: Same as your resource group
* IP addresses: Specify your own address ranges or use the default configuration

**3.** Click **Review + create**.

Valohai will spin up all virtual machines for ML jobs inside this virtual network.
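
The network can also be created with the Azure CLI. The sketch below assumes `az` is installed and logged in; the function name is hypothetical, and the address ranges in the usage example are placeholders, not values required by Valohai.

```bash
# Sketch only: creates a virtual network with a single subnet named "default".
create_valohai_vnet() {
  # $1 = resource group, $2 = region
  # $3 = virtual network address range, $4 = subnet address range
  az network vnet create \
    --resource-group "$1" \
    --location "$2" \
    --name valohai-vnet \
    --address-prefixes "$3" \
    --subnet-name default \
    --subnet-prefixes "$4"
}
```

For example: `create_valohai_vnet valohai westeurope 10.0.0.0/16 10.0.0.0/24`.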

#### Virtual Network Considerations

<table><thead><tr><th width="145.05078125"></th><th></th></tr></thead><tbody><tr><td><strong>Subnet size</strong></td><td><p>Valohai has no hard requirement. IP ranges and sizes depend on:</p><ul><li>Your organization's policy</li><li>Number of parallel jobs you need to run</li></ul><p>Valohai will have one static virtual machine with a public IP in the resource group. All other machines are created and destroyed according to scaling rules.</p></td></tr><tr><td><strong>Outbound internet access (egress)</strong></td><td><p>Providing outbound network access to the machines is strongly recommended.</p><p>The static <code>valohai-queue</code> machine needs outbound access to download the assets it requires to operate the job queue.</p></td></tr><tr><td><strong>Proxy configuration</strong></td><td>If you have a proxy in place, Valohai's workers can be configured to use it. Send the proxy details to your Valohai contact.</td></tr><tr><td><strong>Inbound access (ingress)</strong></td><td>Valohai does not need inbound access to the machines that run your jobs. You can set up all resources yourself and block inbound access.</td></tr><tr><td><strong>Using existing virtual network</strong></td><td>You can use an existing virtual network. Provide its details to Valohai Support.</td></tr></tbody></table>
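
To relate subnet size to the number of parallel jobs, a rough calculation can help. The sketch below relies on the fact that Azure reserves 5 addresses in every subnet. It also assumes each worker VM takes one private IP and adds one more for the queue VM; that sizing rule is an assumption made for illustration, not a Valohai requirement, and the function names are hypothetical.

```bash
# Usable addresses in an Azure subnet with the given prefix length.
# Azure reserves 5 addresses in every subnet.
usable_ips() {
  # $1 = prefix length (e.g. 24 for a /24 subnet)
  echo $(( (1 << (32 - $1)) - 5 ))
}

# Succeeds if the subnet can hold the workers plus one queue VM.
subnet_fits() {
  # $1 = prefix length, $2 = number of parallel worker VMs
  [ "$(usable_ips "$1")" -ge $(( $2 + 1 )) ]
}
```

For example, `usable_ips 24` prints `251`, so a /24 subnet leaves room for the queue VM and up to 250 workers under these assumptions.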

## Create App Registration

Create an app registration in your Microsoft Entra ID tenant to allow Valohai programmatic access to your resource group.

This allows Valohai to create and delete virtual machines for ML jobs. The scope can be limited to this resource group only.

### Create the App Registration

Navigate to **App Registration management panel**.

**1.** Click **New registration**.

**2. Configure registration**

* Name: `Valohai` (or your preferred name)
* Supported Account Type: "Accounts in this organizational directory only (Your Organization Name Here)"
* Redirect URI: Leave empty

**3. Note the IDs:** After creation, note these values:

* **Application (client) ID**
* **Directory (tenant) ID**

#### Create Client Secret

**1. Navigate to Certificates & Secrets:** In your app registration, select "Certificates & Secrets".

**2. Click "New client secret":**

* Description: `Valohai Secret` (or your preferred name)
* Expiry: 12 months (or according to your company policy)

Make note of the expiry time to share with your Valohai contact.

**3. Copy the secret value** from the table immediately - this is the only time you'll be able to see it.
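
The registration and secret can also be created with the Azure CLI. This is a sketch under a few assumptions: `az` is installed and logged in with permission to create app registrations, and the function names are hypothetical. Unlike the portal, the CLI does not create the service principal automatically, so the sketch includes a separate step for it.

```bash
# Sketch only: CLI equivalent of the portal steps above.

# Create a single-tenant app registration and print its Application (client) ID.
create_app_registration() {
  # $1 = display name (e.g. Valohai)
  az ad app create --display-name "$1" --sign-in-audience AzureADMyOrg --query appId --output tsv
}

# Create the service principal that role assignments will target.
create_service_principal() {
  # $1 = Application (client) ID
  az ad sp create --id "$1"
}

# Create a client secret valid for one year and print its value once.
create_client_secret() {
  # $1 = Application (client) ID
  az ad app credential reset --id "$1" --display-name "Valohai Secret" --years 1 --query password --output tsv
}
```

As in the portal, the secret value is only shown when it is created, so store it right away. The Directory (tenant) ID is available from `az account show --query tenantId --output tsv`.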

## Configure Permissions

Grant the App Registration access to manage resources in your resource group.

### Create Custom Role

**1. Navigate to your resource group:**

Open your resource group and note the **Subscription ID**.

**2.** Navigate to **Access Control (IAM)** → **Roles** tab.

**3. Create custom role**

Click **Add custom role**.

* Role name: `ValohaiMasterRole`

**4.** Open the **Assignable scopes** tab and select the correct resource group(s).

**5. Define permissions**

Open the **JSON** tab and replace the permissions section with:

```json
{
  "permissions": [
    {
      "actions": [
        "Microsoft.Resources/deployments/validate/action",
        "Microsoft.Resources/deployments/write",
        "Microsoft.Resources/deployments/operationStatuses/read",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/networkSecurityGroups/read",
        "Microsoft.Network/networkSecurityGroups/join/action",
        "Microsoft.Network/networkSecurityGroups/write",
        "Microsoft.Network/publicIPAddresses/write",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/publicIPAddresses/delete",
        "Microsoft.Network/publicIPAddresses/join/action",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/networkInterfaces/delete",
        "Microsoft.Network/networkInterfaces/effectiveRouteTable/action",
        "Microsoft.Network/networkInterfaces/effectiveNetworkSecurityGroups/action",
        "Microsoft.Network/networkInterfaces/UpdateParentNicAttachmentOnElasticNic/action",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/virtualMachines/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/networkSecurityGroups/securityRules/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/delete"
      ],
      "notActions": [],
      "dataActions": [],
      "notDataActions": []
    }
  ]
}
```

**6.** Click **Review + create** and save your changes.

### Assign Roles to Service Principal

**1.** In the IAM page, click **Add role assignment**.

* Search for `ValohaiMasterRole` and click next
* Select "User, group, or service principal"
* Click **Select members**
* Search for the service principal (the name you gave your app registration)
* Click **Review and assign**

**2. Add Virtual Machine Contributor role**

* Click **Add role assignment**
* Search for "Virtual Machine Contributor" and click next
* Select "User, group, or service principal"
* Click **Select members** and search for your service principal
* Click **Review and assign**
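
Both role assignments can be scripted as well. The sketch below assumes the Azure CLI is installed and logged in, that the custom role `ValohaiMasterRole` already exists, and that the assignments are scoped to the resource group only. The function names are hypothetical.

```bash
# Build the scope string for a single resource group.
resource_group_scope() {
  # $1 = subscription ID, $2 = resource group name
  printf '/subscriptions/%s/resourceGroups/%s\n' "$1" "$2"
}

# Assign both roles to the app registration's service principal.
assign_valohai_roles() {
  # $1 = Application (client) ID, $2 = subscription ID, $3 = resource group name
  scope=$(resource_group_scope "$2" "$3")
  for role in "ValohaiMasterRole" "Virtual Machine Contributor"; do
    az role assignment create --assignee "$1" --role "$role" --scope "$scope"
  done
}
```

Scoping to the resource group keeps the service principal from managing resources elsewhere in the subscription.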

## Create Managed Identity

Create a user-assigned managed identity to assign to the virtual machine running the Valohai queue. The identity authenticates the VM with Key Vault to fetch saved secrets.

### Create User Identity

Navigate to **Azure Managed Identities**.

**Create new identity:**

* Resource Group: Your Valohai resource group
* Region: Your selected region
* Name: `valohai-queue`
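
A CLI sketch of the same step, assuming `az` is installed and logged in (the function name is hypothetical):

```bash
# Sketch only: creates the user-assigned identity used by the queue VM.
create_queue_identity() {
  # $1 = resource group, $2 = region
  az identity create --resource-group "$1" --location "$2" --name valohai-queue
}
```

The command output includes the identity's `principalId`, which is the object ID that a Key Vault access policy refers to.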

## Create Key Vault

Store the queue password in Key Vault.

### Create the Key Vault

Navigate to **Azure Key Vaults**.

**Configuration:**

* Resource Group: Your Valohai resource group
* Key vault name: `valohai-queue-key`
* Region: Your selected region

**Access Policy:**

* Permission Model: Vault access policy

**Add first access policy:**

* Click **Add Access Policy**
* Secret Permissions: Get
* Select Principal: `valohai-queue` (the managed identity)

**Add second access policy:**

* Click **Add Access Policy**
* Secret Permissions: Get
* Select Principal: `Valohai` (the app registration, or the name you gave it)
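
The vault and its two access policies can be sketched with the Azure CLI as follows. This assumes `az` is installed and logged in; the function names are hypothetical. The object IDs are values you look up yourself: the managed identity's `principalId` and the object ID of the app registration's service principal.

```bash
# Sketch only: create a vault that uses access policies instead of Azure RBAC.
create_queue_vault() {
  # $1 = resource group, $2 = region, $3 = vault name
  az keyvault create --resource-group "$1" --location "$2" --name "$3" --enable-rbac-authorization false
}

# Grant a principal permission to read secrets (Get only).
grant_secret_get() {
  # $1 = vault name, $2 = object ID of the principal
  az keyvault set-policy --name "$1" --object-id "$2" --secret-permissions get
}
```

Call `grant_secret_get` twice, once for each principal.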

### Create Secret

Add a new secret to your Key Vault.

**1. Navigate to your Key Vault**

**2. Create secret**

* Upload options: Manual
* Name: `ValohaiRedisSecret`
* Value: Generate a secret with letters and numbers (no special characters)

**3.** Click **Create**.
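
One way to produce a value with only letters and numbers is to filter random bytes, as in the sketch below. The function names are hypothetical, the default length of 32 is an arbitrary choice, and the second function assumes the Azure CLI is installed and logged in.

```bash
# Print a random string containing only letters and digits.
generate_secret() {
  # $1 = length (defaults to 32)
  head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'A-Za-z0-9' | cut -c "1-${1:-32}"
}

# Store the value in Key Vault without echoing it back to the terminal.
store_queue_secret() {
  # $1 = vault name, $2 = secret value
  az keyvault secret set --vault-name "$1" --name ValohaiRedisSecret --value "$2" --output none
}
```

For example: `store_queue_secret <your-vault-name> "$(generate_secret)"`.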

## Create Queue Instance

The Valohai queue instance handles the job queue. app.valohai.com submits jobs to the queue, and your workers read their job information from it.

### Create Virtual Machine

Navigate to **Virtual Machines** and create a new instance in the region and zone where you created your virtual network.

**Basic configuration:**

* Name: `valohai-queue`
* Image: Ubuntu Server 24.04 LTS
* Authentication Type: SSH Key
* Username: `ubuntu` (or your preferred username)
* Inbound Port rules: By default, allows Port 22 for SSH access (edit according to your policy)
* Size: B2s (2 vCPUs, 4GB of memory)

**Disks:**

* Disk Type: Premium SSD (locally redundant)

**Networking:**

* Virtual Network: The network you created earlier
* Subnet: default (or another subnet with outbound internet access)
* Public IP: Create new

**Create the virtual machine.**

### Assign User Identity

After the VM is created:

**1. Navigate to the VM resource**

**2. Open Identity section**

Under Settings, navigate to **Identity**.

**3. Add user-assigned identity**

* Open the **User assigned** tab
* Click **Add**
* Select `valohai-queue`
* Save changes
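
The same assignment as an Azure CLI sketch, assuming `az` is installed and logged in and that the identity lives in the same resource group as the VM (the function name is hypothetical):

```bash
# Sketch only: attach the user-assigned identity to the queue VM.
assign_queue_identity() {
  # $1 = resource group
  az vm identity assign --resource-group "$1" --name valohai-queue --identities valohai-queue
}
```

If the identity is in a different resource group, pass its full resource ID to `--identities` instead of its name.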

### Configure Inbound Rules

Open the Network Security Group associated with your instance (e.g., `valohai-queue-nsg`).

Navigate to **Inbound security rules** and add two new rules:

#### Rule 1: ValohaiApp

* Source: IP Address
* Source IP Addresses/CIDR: `34.248.245.191, 63.34.156.112`
* Source Port Ranges: `*`
* Destination Port Ranges: `63790`
* Protocol: TCP
* Name: `ValohaiApp`
* Description: Allows app.valohai.com and the Valohai scaling service to access the job queue to submit jobs and fetch job status

#### Rule 2: Valohai Certificate

* Source: Any
* Destination Port Ranges: `80`
* Protocol: TCP
* Name: `ValohaiLetsEncryptCertificate`
* Description: Allows the Let's Encrypt HTTP challenge to provision a certificate on the machine

> **Note:** You can provision your own certificate on the machine and not open port 80. Contact Valohai support for details.
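
Both rules can also be created with the Azure CLI. The sketch below assumes `az` is installed and logged in. The function names are hypothetical, and the priority is a value you choose yourself so that it does not collide with existing rules.

```bash
# Sketch only: allow the Valohai application to reach the job queue.
add_valohai_app_rule() {
  # $1 = resource group, $2 = NSG name, $3 = rule priority (e.g. 1010)
  az network nsg rule create \
    --resource-group "$1" --nsg-name "$2" --name ValohaiApp \
    --priority "$3" --direction Inbound --access Allow --protocol Tcp \
    --source-address-prefixes 34.248.245.191 63.34.156.112 \
    --source-port-ranges '*' --destination-port-ranges 63790
}

# Sketch only: allow the HTTP challenge used to provision the certificate.
add_certificate_rule() {
  # $1 = resource group, $2 = NSG name, $3 = rule priority (e.g. 1020)
  az network nsg rule create \
    --resource-group "$1" --nsg-name "$2" --name ValohaiLetsEncryptCertificate \
    --priority "$3" --direction Inbound --access Allow --protocol Tcp \
    --source-address-prefixes '*' --destination-port-ranges 80
}
```

Skip `add_certificate_rule` if you provision your own certificate and keep port 80 closed.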

## Initialize Queue Instance

SSH into your virtual machine and complete the Valohai job queue setup.

**1. Connect to the instance**

```shell
ssh -i your-key.pem ubuntu@<vm-public-ip>
```

**2. Run setup script**

Replace values before running:

* `<queue_address>` with your queue address from Valohai
* `VAULT_NAME` in the `PASSWORD` line with your Key Vault's name (e.g., `valohai-queue-key`)

```bash
# Queue DNS name provided by Valohai
export QUEUE=<queue_address>
# Request a Key Vault access token from the instance metadata service using the VM's managed identity
ACCESS_TOKEN=$(curl -s 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net' -H Metadata:true | python3 -c "import sys, json; print(json.load(sys.stdin)['access_token'])")
# Read the queue password from Key Vault
PASSWORD=$(curl -s 'https://VAULT_NAME.vault.azure.net/secrets/ValohaiRedisSecret?api-version=2016-10-01' -H "Authorization: Bearer $ACCESS_TOKEN" | python3 -c "import sys, json; print(json.load(sys.stdin)['value'])")
# Download and run the Valohai queue setup script
curl -fsSL https://raw.githubusercontent.com/valohai/worker-queue/main/host/setup.sh | sudo QUEUE_ADDRESS="$QUEUE" REDIS_PASSWORD="$PASSWORD" bash
unset PASSWORD ACCESS_TOKEN
```
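
The commands above pass the results of two HTTP lookups straight to the setup script, so a failed lookup (for example, a missing access policy) would hand it an empty password. A small guard like the following can catch that first. It is an optional sketch and the function name is hypothetical.

```bash
# Fail with a message if a required value is empty.
require_value() {
  # $1 = label used in the error message, $2 = value to check
  if [ -z "$2" ]; then
    echo "Missing $1; stopping before running setup" >&2
    return 1
  fi
}
```

For example, run `require_value "queue password" "$PASSWORD"` after the `PASSWORD` line and continue only if it succeeds.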

## Register Resource Providers

Register resource providers to configure your subscription to work with the required services.

Valohai uses these resource providers:

* Microsoft.Compute
* Microsoft.Network

### Verify Registration

**1.** Go to **Azure Portal > Subscriptions**.

**2. Select the subscription** used for Valohai.

**3.** Navigate to **Resource providers** in the left menu.

**4. Register providers**

If not already registered, register:

* Microsoft.Compute
* Microsoft.Network
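
Registration can also be checked and triggered with the Azure CLI. This sketch assumes `az` is installed and logged in to the subscription used for Valohai; the function names are hypothetical.

```bash
# Register both providers; registering an already registered provider is harmless.
register_valohai_providers() {
  for namespace in Microsoft.Compute Microsoft.Network; do
    az provider register --namespace "$namespace"
  done
}

# Print the registration state of one provider (e.g. Registered, NotRegistered).
show_provider_state() {
  # $1 = provider namespace
  az provider show --namespace "$1" --query registrationState --output tsv
}
```

Registration can take a few minutes, so rerun `show_provider_state Microsoft.Compute` until it reports `Registered`.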

## Summary

Collect the following information to send to Valohai:

**Location and Account:**

* Region: `____________`
* Subscription ID: `____________`
* Resource Group Name: `____________`

**App Registration:**

* Directory (tenant) ID: `____________`
* Application (client) ID: `____________`
* Application Secret: `____________`
* Application Secret Expiry Date: `____________`

**Networking:**

* Virtual Network name: `____________`
* Subnet name (optional): `____________`

**Queue Instance:**

* Private IP: `____________`
* Public IP: `____________`

**Key Vault:**

* Name: `____________`

Share this information with your Valohai contact using your organization's secure communication method.

## Next Steps

After Valohai confirms your environment is configured:

**1. Verify the setup**

* Log in to app.valohai.com
* Check that Azure environments appear in your organization
* Create a test project
* Run a simple execution to verify workers launch correctly

**2. Configure additional resources**

* Add existing Azure storage accounts as data stores
* Set up private Docker registries
* Configure access to Azure databases

## Getting Help

**Valohai Support:** <support@valohai.com>

**Include in support requests:**

* Subscription ID
* Resource Group name
* Region
* Error messages or logs
* Steps already attempted


