Hybrid Deployment

Manually deploy Valohai's compute and data layer to your Azure environment

Deploy Valohai workers and storage to your Azure Resource Group while Valohai manages the application layer at app.valohai.com.

What Gets Deployed

The Compute and Data Layer of Valohai can be deployed to your Azure Resource Group. This enables you to:

  • Use your own Virtual Machine instances to run machine learning jobs

  • Use your own Azure Blob Storage for storing training artifacts (trained models, preprocessed datasets, visualizations)

  • Access databases and data warehouses directly from workers inside your network

Valohai doesn't have direct access to the virtual machine instances that execute machine learning jobs. Instead, it communicates with a static virtual machine or an Azure Managed Redis in your resource group that's responsible for storing the job queue, job states, and short-term logs.

Prerequisites

From Valohai:

Contact [email protected] to receive:

  • valohai_assume_user - ARN of the user Valohai will use to assume a role

  • queue_address - DNS name assigned to your queue

From your Azure account:

  • Admin access to Azure Portal

  • Azure subscription

  • Resource group (existing or new)

  • Region selected

Selecting Your Region

When selecting your region, consider:

  • GPU availability: Regions have different collections of available GPU types

    • US: East US or West US 2 (widest array of GPU types in the United States)

    • EU: West Europe (widest array of GPU types in Europe)

    • Check the Azure product availability page for details

  • Data location: Use the region where your data is located to reduce transfer times

  • GPU quota: Use regions where you've already acquired GPU quota from Microsoft

Create Resource Group

Navigate to Resource Group Management and select "Add".

Configuration:

  • Subscription: Select your subscription

  • Resource Group Name: valohai (or your preferred name)

  • Region: Your selected region

Make note of the Resource Group name, as Valohai engineers will need this.

Create Virtual Network

Valohai needs a virtual network. You can provide an existing vNet and subnets or create a new one.

Create New Virtual Network

1. Navigate to your resource group: Select Add and search for "Virtual Network".

2. Configure the network

  • Name: valohai-vnet (or your preferred name)

  • Region: Same as your resource group

  • IP addresses: Specify an address range or use the default configuration

3. Click Review + create.

Valohai will spin up all virtual machines for ML jobs inside this virtual network.

Virtual Network Considerations

Subnet size

There is no hard requirement from Valohai. IP ranges and sizes depend on:

  • Your organization's policy

  • Number of parallel jobs you need to run

Valohai will have one static virtual machine with a public IP in the resource group. All other machines are created and destroyed according to scaling rules.
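
As a rough sizing aid, the sketch below estimates how many worker machines fit in a candidate subnet. It relies on Azure reserving five addresses in every subnet, and it assumes that each worker uses one private IP and that the static queue machine shares the subnet with the workers. The function name and the static_machines parameter are illustrative and not part of Valohai.

```python
import ipaddress

# Azure reserves five addresses in every subnet: the network address,
# the default gateway, two for Azure DNS, and the broadcast address.
AZURE_RESERVED_PER_SUBNET = 5


def max_parallel_workers(subnet_cidr: str, static_machines: int = 1) -> int:
    """Estimate how many worker VMs fit in a subnet.

    static_machines is an assumption: the single valohai-queue VM,
    if it lives in the same subnet as the workers.
    """
    subnet = ipaddress.ip_network(subnet_cidr, strict=True)
    usable = subnet.num_addresses - AZURE_RESERVED_PER_SUBNET
    return max(usable - static_machines, 0)


for cidr in ("10.0.0.0/24", "10.0.0.0/27"):
    print(cidr, "fits about", max_parallel_workers(cidr), "workers")
```

Treat the result as an upper bound: GPU quota and your scaling rules usually limit parallel jobs well before the subnet runs out of addresses.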

Outbound internet access (egress)

Providing outbound network access to the machines is strongly recommended.

The static valohai-queue machine needs outbound access to download assets to operate the job queue.

Proxy configuration

If you have a proxy in place, Valohai's workers can be configured to use it. Share the proxy details with your Valohai contact.

Inbound access (ingress)

Valohai does not need to access machines in your network. You can set up all resources yourself and block inbound access.

Using existing virtual network

You can use an existing virtual network instead of creating a new one. Provide the details of that virtual network and its subnets to Valohai Support.

Create App Registration

Create an app registration in Microsoft Entra ID (formerly Azure Active Directory) to give Valohai programmatic access to your resource group.

This allows Valohai to create and delete virtual machines for ML jobs. The scope can be limited to this resource group only.

Create the App Registration

Navigate to App Registration management panel.

1. Click New registration.

2. Configure registration

  • Name: Valohai (or your preferred name)

  • Supported Account Type: "Accounts in this organizational directory only (Your Organization Name Here)"

  • Redirect URI: Leave empty

3. Note the IDs: After creation, note these values:

  • Application (client) ID

  • Directory (tenant) ID

Create Client Secret

1. Navigate to Certificates & Secrets: In your app registration, select "Certificates & Secrets".

2. Click "New client secret":

  • Description: Valohai Secret (or your preferred name)

  • Expiry: 12 months (or according to your company policy)

Make note of the expiry time to share with your Valohai contact.

3. Copy the secret value from the table immediately - this is the only time you'll be able to see it.
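
To keep track of the expiry, a small sketch like the following can compute the expected expiry date and a reminder date from the day the secret was created. It assumes a calendar-month interpretation of "12 months" and a 30-day reminder window; both are illustrative choices, and the expiry shown in the Azure Portal remains the authoritative value.

```python
import calendar
from datetime import date, timedelta


def add_months(start: date, months: int) -> date:
    """Return the same day of the month `months` later, clamped to month end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def rotation_reminder(created: date, valid_months: int = 12, warn_days: int = 30):
    """Return (expected expiry date, date to start rotating the secret)."""
    expiry = add_months(created, valid_months)
    return expiry, expiry - timedelta(days=warn_days)


expiry, remind_on = rotation_reminder(date.today())
print("Expected expiry:", expiry.isoformat())
print("Plan rotation from:", remind_on.isoformat())
```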

Configure Permissions

Grant the App Registration access to manage resources in your resource group.

Create Custom Role

1. Navigate to your resource group:

Open your resource group and note the Subscription ID.

2. Navigate to Access Control (IAM) and open the Roles tab.

3. Create custom role

Click Add custom role.

  • Role name: ValohaiMasterRole

4. Open the Assignable scopes tab and select the correct resource group(s).

5. Define permissions

Open the JSON tab and replace the permissions section with:

{
  "permissions": [
    {
      "actions": [
        "Microsoft.Resources/deployments/validate/action",
        "Microsoft.Resources/deployments/write",
        "Microsoft.Resources/deployments/operationStatuses/read",
        "Microsoft.Network/virtualNetworks/subnets/read",
        "Microsoft.Network/networkSecurityGroups/read",
        "Microsoft.Network/networkSecurityGroups/join/action",
        "Microsoft.Network/networkSecurityGroups/write",
        "Microsoft.Network/publicIPAddresses/write",
        "Microsoft.Network/publicIPAddresses/read",
        "Microsoft.Network/publicIPAddresses/delete",
        "Microsoft.Network/publicIPAddresses/join/action",
        "Microsoft.Network/networkInterfaces/read",
        "Microsoft.Network/networkInterfaces/write",
        "Microsoft.Network/networkInterfaces/join/action",
        "Microsoft.Network/networkInterfaces/delete",
        "Microsoft.Network/networkInterfaces/effectiveRouteTable/action",
        "Microsoft.Network/networkInterfaces/effectiveNetworkSecurityGroups/action",
        "Microsoft.Network/networkInterfaces/UpdateParentNicAttachmentOnElasticNic/action",
        "Microsoft.Network/virtualNetworks/subnets/join/action",
        "Microsoft.Network/virtualNetworks/subnets/virtualMachines/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/write",
        "Microsoft.Network/networkSecurityGroups/securityRules/read",
        "Microsoft.Network/networkSecurityGroups/securityRules/delete"
      ],
      "notActions": [],
      "dataActions": [],
      "notDataActions": []
    }
  ]
}

6. Click Review + create and save your changes.
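
For teams that keep role definitions in version control, the sketch below shows how the assignable scope and the permissions section fit together in one document. The scope string follows the standard Azure resource group format. The "properties" wrapper mirrors the layout the portal's JSON tab typically shows, but treat it as an assumption and compare it with what your portal displays. The helper names, the placeholder subscription ID, and the shortened action list are illustrative only.

```python
import json


def resource_group_scope(subscription_id: str, resource_group: str) -> str:
    """Build the Azure scope string for a single resource group."""
    return f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"


def build_custom_role(role_name: str, scopes: list, actions: list) -> dict:
    """Combine a role name, scopes, and actions into one role document."""
    duplicates = sorted({a for a in actions if actions.count(a) > 1})
    if duplicates:
        raise ValueError(f"duplicate actions: {duplicates}")
    return {
        "properties": {
            "roleName": role_name,
            "assignableScopes": list(scopes),
            "permissions": [
                {
                    "actions": list(actions),
                    "notActions": [],
                    "dataActions": [],
                    "notDataActions": [],
                }
            ],
        }
    }


# Placeholder subscription ID and a shortened action list, for illustration.
scope = resource_group_scope("00000000-0000-0000-0000-000000000000", "valohai")
role = build_custom_role(
    "ValohaiMasterRole",
    [scope],
    [
        "Microsoft.Resources/deployments/write",
        "Microsoft.Network/publicIPAddresses/read",
    ],
)
print(json.dumps(role, indent=2))
```

Limiting assignableScopes to the resource group keeps the role from being assigned anywhere else in the subscription.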

Assign Roles to Service Principal

1. In the IAM page, click Add role assignment.

  • Search for ValohaiMasterRole and click Next

  • Select "User, group, or service principal"

  • Click Select members

  • Search for the service principal (the name you gave your app registration)

  • Click Review and assign

2. Add Virtual Machine Contributor role

  • Click Add role assignment

  • Search for "Virtual Machine Contributor" and click Next

  • Select "User, group, or service principal"

  • Click Select members and search for your service principal

  • Click Review and assign

Create Managed Identity

Create a user-assigned managed identity and assign it to the virtual machine that runs the Valohai queue. The identity authenticates the VM with Key Vault so it can fetch the saved secrets.

Create User Identity

Navigate to Azure Managed Identities.

Create new identity:

  • Resource Group: Your Valohai resource group

  • Region: Your selected region

  • Name: valohai-queue

Create Key Vault

Store the queue password in Key Vault.

Create the Key Vault

Navigate to Azure Key Vaults.

Configuration:

  • Resource Group: Your Valohai resource group

  • Key vault name: valohai-queue-key

  • Region: Your selected region

Access Policy:

  • Permission model: Vault access policy

Add first access policy:

  • Click Add Access Policy

  • Secret Permissions: Get

  • Select Principal: valohai-queue (the managed identity)

Add second access policy:

  • Click Add Access Policy

  • Secret Permissions: Get

  • Select Principal: Valohai (the app registration, or the name you gave it)

Create Secret

Add a new secret to your Key Vault.

1. Navigate to your Key Vault

2. Create secret

  • Upload options: Manual

  • Name: ValohaiRedisSecret

  • Value: Generate a secret with letters and numbers (no special characters)

3. Click Create.
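
One way to generate a suitable value is Python's secrets module, which draws from the operating system's cryptographic random source. The sketch below restricts the alphabet to ASCII letters and digits, as required above. The 32-character length and the function name are assumptions, not Valohai requirements.

```python
import secrets
import string

# Letters and digits only, so the value contains no special characters.
ALPHABET = string.ascii_letters + string.digits


def generate_queue_password(length: int = 32) -> str:
    """Return a random alphanumeric string of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(generate_queue_password())
```

Paste the printed value into the Value field, and avoid storing it anywhere other than the Key Vault.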

Create Queue Instance

The Valohai queue instance handles the job queue. app.valohai.com submits jobs to the queue, and your workers read their job information from it.

Create Virtual Machine

Navigate to Virtual Machines and create a new instance in the region and zone where you created your virtual network.

Basic configuration:

  • Name: valohai-queue

  • Image: Ubuntu Server 24.04 LTS

  • Authentication Type: SSH Key

  • Username: ubuntu (or your preferred username)

  • Inbound Port rules: By default, allows Port 22 for SSH access (edit according to your policy)

  • Size: B2s (2 vCPUs, 4 GB of memory)

Disks:

  • Disk Type: Premium SSD (locally redundant)

Networking:

  • Virtual Network: The network you created earlier

  • Subnet: default (or another subnet with outbound internet access)

  • Public IP: Create new

Create the virtual machine.

Assign User Identity

After the VM is created:

1. Navigate to the VM resource

2. Open Identity section

Under Settings, navigate to Identity.

3. Add user-assigned identity

  • Open the User assigned tab

  • Click Add

  • Select valohai-queue

  • Save changes

Configure Inbound Rules

Open the Network Security Group associated with your instance (e.g., valohai-queue-nsg).

Navigate to Inbound security rules and add two new rules:

Rule 1: ValohaiApp

  • Source: IP Address

  • Source IP Addresses/CIDR: 34.248.245.191, 63.34.156.112

  • Source Port Ranges: *

  • Destination Port Ranges: 63790

  • Protocol: TCP

  • Name: ValohaiApp

  • Description: Allows app.valohai.com and the Valohai scaling service to access the job queue to submit jobs and fetch job status

Rule 2: Valohai Certificate

  • Source: Any

  • Destination Port Ranges: 80

  • Protocol: TCP

  • Name: ValohaiLetsEncryptCertificate

  • Description: Allows using LetsEncrypt HTTP challenge to provision a certificate on the machine

Note: You can provision your own certificate on the machine and not open port 80. Contact Valohai support for details.
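
To sanity-check the intent of the two rules before applying them, the sketch below models them as data and tests whether a given source address and port would match. It is a simplified model: it ignores rule priorities, Azure's default rules, and the SSH rule created with the VM. The InboundRule class and matching_rule function are hypothetical helpers.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundRule:
    name: str
    sources: tuple  # CIDR strings; an empty tuple stands for "Any"
    port: int
    protocol: str = "TCP"

    def allows(self, source_ip: str, port: int, protocol: str = "TCP") -> bool:
        if port != self.port or protocol.upper() != self.protocol:
            return False
        if not self.sources:
            return True
        addr = ipaddress.ip_address(source_ip)
        return any(addr in ipaddress.ip_network(cidr) for cidr in self.sources)


# The two rules described above.
RULES = (
    InboundRule("ValohaiApp", ("34.248.245.191/32", "63.34.156.112/32"), 63790),
    InboundRule("ValohaiLetsEncryptCertificate", (), 80),
)


def matching_rule(source_ip: str, port: int, protocol: str = "TCP"):
    """Return the name of the first rule that allows the traffic, or None."""
    for rule in RULES:
        if rule.allows(source_ip, port, protocol):
            return rule.name
    return None


print(matching_rule("34.248.245.191", 63790))
print(matching_rule("203.0.113.7", 63790))
```

In this model, only the two listed Valohai addresses reach the queue port, while port 80 is open to any source for the certificate challenge.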

Initialize Queue Instance

SSH into your virtual machine and complete the Valohai job queue setup.

1. Connect to the instance

ssh -i your-key.pem ubuntu@<vm-public-ip>

2. Run setup script

Replace values before running:

  • <queue_address> with your queue address from Valohai

  • VAULT_NAME in the PASSWORD line with your Key Vault's name (e.g., valohai-queue-key)

export QUEUE=<queue_address>
# Request a Key Vault access token from the instance metadata service (uses the VM's managed identity)
ACCESS_TOKEN=$(curl -s 'http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fvault.azure.net' -H Metadata:true | python3 -c "import sys, json; print(json.load(sys.stdin)['access_token'])")
# Read the queue password from Key Vault
PASSWORD=$(curl -s 'https://VAULT_NAME.vault.azure.net/secrets/ValohaiRedisSecret?api-version=2016-10-01' -H "Authorization: Bearer $ACCESS_TOKEN" | python3 -c "import sys, json; print(json.load(sys.stdin)['value'])")
# Download and run the queue setup script, then clear the secrets from the shell
curl -fsSL https://raw.githubusercontent.com/valohai/worker-queue/main/host/setup.sh | sudo QUEUE_ADDRESS="$QUEUE" REDIS_PASSWORD="$PASSWORD" bash
unset PASSWORD ACCESS_TOKEN
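
If the secret lookup fails, it can help to run the same two requests step by step. The sketch below reproduces them with Python's standard library, using the same endpoints and API versions as the script. The function names are illustrative, and fetch_queue_password only works when run on the queue VM with the managed identity attached.

```python
import json
import urllib.parse
import urllib.request

IMDS_TOKEN_ENDPOINT = "http://169.254.169.254/metadata/identity/oauth2/token"


def imds_token_url(resource: str = "https://vault.azure.net") -> str:
    """URL that asks the instance metadata service for a Key Vault token."""
    query = urllib.parse.urlencode({"api-version": "2018-02-01", "resource": resource})
    return f"{IMDS_TOKEN_ENDPOINT}?{query}"


def secret_url(vault_name: str, secret_name: str = "ValohaiRedisSecret") -> str:
    """URL of the latest version of a secret in the given Key Vault."""
    return (
        f"https://{vault_name}.vault.azure.net/secrets/{secret_name}"
        "?api-version=2016-10-01"
    )


def get_json(url: str, headers: dict) -> dict:
    """Send a GET request and decode the JSON response body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def fetch_queue_password(vault_name: str) -> str:
    """Get a token from the metadata service, then read the secret value."""
    token = get_json(imds_token_url(), {"Metadata": "true"})["access_token"]
    secret = get_json(secret_url(vault_name), {"Authorization": f"Bearer {token}"})
    return secret["value"]


print(imds_token_url())
print(secret_url("VAULT_NAME"))
```

An authorization error on the second request usually points to a missing Key Vault access policy for the valohai-queue identity, while a failure on the first suggests the identity is not assigned to the VM.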

Register Resource Providers

Register resource providers to configure your subscription to work with the required services.

Valohai uses these resource providers:

  • Microsoft.Compute

  • Microsoft.Network

Verify Registration

1. Go to Azure Portal > Subscriptions.

2. Select the subscription used for Valohai.

3. Navigate to Resource providers in the left menu.

4. Register providers

If not already registered, register:

  • Microsoft.Compute

  • Microsoft.Network

Summary

Collect the following information to send to Valohai:

Location and Account:

  • Region: ____________

  • Subscription ID: ____________

  • Resource Group Name: ____________

App Registration:

  • Directory (tenant) ID: ____________

  • Application (client) ID: ____________

  • Application Secret: ____________

  • Application Secret Expiry Date: ____________

Networking:

  • Virtual Network name: ____________

  • Subnet name (optional): ____________

Queue Instance:

  • Private IP: ____________

  • Public IP: ____________

Key Vault:

  • Name: ____________

Share this information with your Valohai contact using your organization's secure communication method.
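
Before sending the details, a quick format check can catch copy-paste mistakes. The sketch below assumes the values are collected in a dictionary with the hypothetical field names shown; it checks that the Azure IDs are well-formed GUIDs, that the IP addresses parse, and that no text field is empty. It does not verify that the values exist in Azure.

```python
import ipaddress
import uuid

GUID_FIELDS = ("subscription_id", "tenant_id", "client_id")
IP_FIELDS = ("queue_private_ip", "queue_public_ip")
TEXT_FIELDS = (
    "region", "resource_group", "client_secret",
    "client_secret_expiry", "vnet_name", "key_vault_name",
)


def is_guid(value: str) -> bool:
    """True if value is a hyphenated GUID such as a subscription ID."""
    try:
        return str(uuid.UUID(value)) == value.lower()
    except ValueError:
        return False


def is_ip(value: str) -> bool:
    """True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
    except ValueError:
        return False
    return True


def check_summary(summary: dict) -> list:
    """Return a list of problems; an empty list means every check passed."""
    problems = []
    for field in TEXT_FIELDS:
        if not str(summary.get(field, "")).strip():
            problems.append(f"{field}: missing")
    for field in GUID_FIELDS:
        if not is_guid(str(summary.get(field, ""))):
            problems.append(f"{field}: not a GUID")
    for field in IP_FIELDS:
        if not is_ip(str(summary.get(field, ""))):
            problems.append(f"{field}: not an IP address")
    return problems


# Placeholder values for illustration only.
example = {
    "region": "westeurope",
    "subscription_id": "11111111-2222-3333-4444-555555555555",
    "resource_group": "valohai",
    "tenant_id": "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee",
    "client_id": "99999999-8888-7777-6666-555555555555",
    "client_secret": "placeholder",
    "client_secret_expiry": "2026-01-15",
    "vnet_name": "valohai-vnet",
    "queue_private_ip": "10.0.0.4",
    "queue_public_ip": "203.0.113.10",
    "key_vault_name": "valohai-queue-key",
}
print(check_summary(example))
```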

Next Steps

After Valohai confirms your environment is configured:

1. Verify the setup

  • Log in to app.valohai.com

  • Check that Azure environments appear in your organization

  • Create a test project

  • Run a simple execution to verify workers launch correctly

2. Configure additional resources

  • Add existing Azure storage accounts as data stores

  • Set up private Docker registries

  • Configure access to Azure databases

Getting Help

Valohai Support: [email protected]

Include in support requests:

  • Subscription ID

  • Resource Group name

  • Region

  • Error messages or logs

  • Steps already attempted
