Remote Access (SSH)

SSH access lets you connect directly to running Valohai executions for interactive debugging, IDE attachment, and real-time inspection.

Why use SSH debugging:

  • Attach your local IDE (VS Code, PyCharm) to debug code running on cloud infrastructure

  • Inspect execution state interactively without stopping the job

  • Tunnel to services like TensorBoard for real-time monitoring

  • Test code changes directly in the execution environment

Requirements

SSH access requires:

  1. Enterprise plan with on-premises, AWS, Azure, or GCP environments

  2. Firewall configuration by your organization administrator

  3. SSH key pair for authentication

SSH connects to the Docker container running your code, not the underlying VM. This means Valohai internals and the host OS are not accessible.

Quick Start

For Developers

  1. Generate SSH keys (or use auto-generated keys)

  2. Start execution with "Run with SSH" enabled

  3. Wait for IP address in logs

  4. Connect using SSH or your IDE

For Administrators

Before developers can use SSH, complete the one-time setup below.

Administrator Setup

Step 1: Set Default SSH Port
  1. Navigate to Hi, [name] → Manage [organization]

  2. Go to Settings

  3. Set Default Debug Port (must be above 1023, e.g., 2222)

Step 2: Configure Firewall Rules

AWS

  1. Open the Security Group named valohai-sg-workers

  2. Click Edit Inbound Rules → Add Rule

  3. Configure:

    • Type: Custom TCP

    • Port Range: Your debug port (e.g., 2222)

    • Source: 0.0.0.0/0 (or whitelist specific IPs/CIDR blocks)

    • Description: "Allows SSH to Valohai executions"

💡 Setting source to 0.0.0.0/0 allows connections from anywhere, but users still need the SSH private key to authenticate.
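The inbound rule above can also be sketched with the AWS CLI. This is a minimal example under stated assumptions, not a documented Valohai procedure: the security group ID and the CIDR block are placeholders, and the hypothetical helper only prints the command so it can be reviewed before running.

```shell
# Print (not run) the AWS CLI call that opens the debug port.
# sg-0123456789abcdef0 and 203.0.113.0/24 are placeholder values.
aws_debug_rule() {
    group_id=$1
    port=$2
    cidr=$3
    printf 'aws ec2 authorize-security-group-ingress'
    printf ' --group-id %s --protocol tcp --port %s --cidr %s\n' \
        "$group_id" "$port" "$cidr"
}

aws_debug_rule sg-0123456789abcdef0 2222 203.0.113.0/24
```

Once the printed command looks right, it can be pasted into a terminal with AWS credentials configured. Passing a narrow CIDR instead of 0.0.0.0/0 limits who can reach the port at all.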

Google Cloud

  1. Create a firewall rule:

    • Name: valohai-fr-worker-ssh

    • Description: "Allows SSH to Valohai executions"

    • Network: Your Valohai VPC (e.g., valohai-vpc)

    • Direction: Ingress

    • Targets: Specified target tags: valohai-worker

    • Source: 0.0.0.0/0 (or whitelist specific IPs/CIDR blocks)

    • Protocols and ports: TCP: 2222 (your debug port)
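For reference, the same rule could be created with the gcloud CLI. The sketch below assumes the VPC is named valohai-vpc as in the example above and uses a placeholder source range; the hypothetical helper only prints the command for review.

```shell
# Print (not run) the gcloud call that creates the firewall rule.
# The source range 203.0.113.0/24 is a placeholder value.
gcp_debug_rule() {
    network=$1
    port=$2
    source_range=$3
    printf 'gcloud compute firewall-rules create valohai-fr-worker-ssh'
    printf ' --network %s --direction INGRESS --action ALLOW' "$network"
    printf ' --rules tcp:%s --source-ranges %s' "$port" "$source_range"
    printf ' --target-tags valohai-worker\n'
}

gcp_debug_rule valohai-vpc 2222 203.0.113.0/24
```

The rule name and target tag are taken from the settings listed above; the description field is left out here to keep the command short.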

Azure

  1. Open the Network Security Group associated with Valohai workers

  2. Add an Inbound security rule:

    • Source: Any (or specific IP ranges)

    • Source port ranges: *

    • Destination: Any

    • Destination port ranges: 2222 (your debug port)

    • Protocol: TCP

    • Action: Allow

    • Priority: 1000 (or appropriate for your NSG)

    • Name: AllowValohaiSSH
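An equivalent rule can be sketched with the Azure CLI. The resource group and NSG names below are hypothetical placeholders, since they depend on how the Valohai workers were deployed; the helper only prints the command so it can be checked first.

```shell
# Print (not run) the Azure CLI call that adds the inbound rule.
# my-resource-group and my-valohai-nsg are placeholder names.
azure_debug_rule() {
    resource_group=$1
    nsg_name=$2
    port=$3
    printf 'az network nsg rule create --resource-group %s --nsg-name %s' \
        "$resource_group" "$nsg_name"
    printf ' --name AllowValohaiSSH --priority 1000'
    printf ' --direction Inbound --access Allow --protocol Tcp'
    printf ' --source-address-prefixes "*" --source-port-ranges "*"'
    printf ' --destination-address-prefixes "*"'
    printf ' --destination-port-ranges %s\n' "$port"
}

azure_debug_rule my-resource-group my-valohai-nsg 2222
```

The asterisks are quoted in the printed command so that the shell does not expand them when the command is run.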

Generate SSH Keys

You can auto-generate keys in the Valohai UI or create them manually.

Option B: Manual Key Generation

Generate a 4096-bit RSA key pair:

ssh-keygen -t rsa -b 4096 -N '' -f valohai-debug-key

This creates:

  • valohai-debug-key.pub - Paste into Valohai UI before starting execution

  • valohai-debug-key - Use to connect (keep this secure)

⚠️ Never commit SSH keys to version control. Anyone with the private key can access your execution.

Regenerate keys periodically according to your organization's security policies.


Start an Execution with SSH

From Web UI

  1. Create or open an execution

  2. Enable Run with SSH

  3. Paste your public key (or auto-generate)

  4. Adjust the TCP/IP port if needed (it defaults to the organization setting)

  5. Click Create Execution

From Command Line

vh exec run --adhoc \
  --debug-key-file=/path/to/your-key.pub \
  --debug-port 2222 \
  train

Connect to Your Execution

1. Wait for IP Address

After starting the execution, watch the logs for the connection details. The log will show a line like:

SSH connection available at: 52.214.159.193:2222

💡 If you don't see the IP, ensure SSH was enabled when starting the execution.
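If you script the connection, the address and port can be pulled out of that log line. The sketch below assumes the line has exactly the format shown above; parse_ssh_endpoint is a hypothetical helper name.

```shell
# Extract "host port" from a log line in the format shown above.
# Prints nothing when the line does not match.
parse_ssh_endpoint() {
    printf '%s\n' "$1" |
        sed -n 's/^.*SSH connection available at: *\([0-9.]*\):\([0-9]*\).*$/\1 \2/p'
}

line='SSH connection available at: 52.214.159.193:2222'
endpoint=$(parse_ssh_endpoint "$line")
host=${endpoint% *}   # text before the space
port=${endpoint#* }   # text after the space
echo "ssh -i /path/to/private-key $host -p $port"
```

This prints the ssh command for the example line rather than running it, so the values can be checked before connecting.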

2. Keep Execution Running

Important: Executions shut down when the command finishes. Add a sleep to keep it alive:

- step:
    name: train
    command:
      - python train.py {parameters}
      - sleep 1h  # Keeps execution alive for debugging

Why 1 hour? A bounded sleep caps the cost if you forget to stop the execution, whereas an unbounded wait would keep consuming compute until someone notices.

For IDE debugging, use debugger wait patterns (see VS Code or PyCharm guides) instead of sleep.
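A fixed sleep always runs for its full duration. As one possible variation (an assumption for illustration, not a Valohai feature), the execution can wait on a sentinel file instead, so that removing the file from your SSH session ends the wait early while a time limit still applies. The helper name and the file path are hypothetical.

```shell
# Keep the execution alive until the sentinel file is removed or the
# time limit passes, whichever comes first.
hold_for_debug() {
    sentinel=$1        # file whose removal ends the wait
    max_seconds=$2     # upper bound, so a forgotten session still ends
    interval=${3:-5}   # seconds between checks
    touch "$sentinel"
    waited=0
    while [ -e "$sentinel" ] && [ "$waited" -lt "$max_seconds" ]; do
        sleep "$interval"
        waited=$((waited + interval))
    done
    rm -f "$sentinel"
}
```

For example, calling hold_for_debug /tmp/keep-alive 3600 as the last command waits up to one hour, and running rm /tmp/keep-alive over SSH lets the execution finish immediately.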

3. Choose Connection Method

Interactive Shell

ssh -i /path/to/private-key <IP> -p 2222 -t /bin/bash

This opens a bash session inside your execution container.

Run Single Command

ssh -i /path/to/private-key <IP> -p 2222 -t ps aux

Returns command output to your terminal.

SSH Tunnel (for services like TensorBoard)

ssh -i /path/to/private-key <IP> -p 2222 -L 5678:127.0.0.1:5678

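Forwards port 5678 on your local machine to port 5678 inside the execution, so a service listening there is reachable at localhost:5678. Replace 5678 with the port your service uses (TensorBoard listens on 6006 by default).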
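To avoid retyping these options, the connection details can be kept in an SSH client configuration entry. The sketch below prints such an entry; the alias valohai-debug is a hypothetical name, and no User line is set because the commands above do not specify one.

```shell
# Print an ssh_config Host block for one execution.
# "valohai-debug" is a hypothetical alias chosen for this example.
ssh_config_entry() {
    host=$1
    port=$2
    key=$3
    cat <<EOF
Host valohai-debug
    HostName $host
    Port $port
    IdentityFile $key
    IdentitiesOnly yes
    LocalForward 5678 127.0.0.1:5678
EOF
}

ssh_config_entry 52.214.159.193 2222 /path/to/private-key
```

After appending the output to ~/.ssh/config, ssh valohai-debug opens the session with the tunnel in place. The IP address is likely to differ between executions, so the entry needs to be regenerated for each one.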



Common Issues

No IP address in logs?

  • Verify SSH was enabled when starting the execution

  • Check that you pasted the public key (not private key)

Connection refused?

  • Confirm firewall rules allow traffic on your debug port

  • Verify the port matches your organization settings

Authentication failed?

  • Ensure you're using the private key (not .pub file)

  • Check file permissions: chmod 600 /path/to/private-key
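Both checks can be scripted before connecting. This is a small sketch with hypothetical helper names; it relies only on the layout of ls -l output and on the header line that private key files start with.

```shell
# True when neither group nor others have any permission on the file.
# Characters 5-10 of `ls -l` output are the group and other bits.
key_is_private_only() {
    [ "$(ls -ld "$1" | cut -c5-10)" = "------" ]
}

# True when the first line carries a private-key header. A .pub file
# starts with the key type (for example "ssh-rsa") instead.
looks_like_private_key() {
    head -n 1 "$1" | grep -q 'PRIVATE KEY'
}
```

For example, looks_like_private_key valohai-debug-key && key_is_private_only valohai-debug-key succeeds only when the file is a private key that other users cannot read.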

Execution shuts down too quickly?

  • Add sleep 1h at the end of your command

  • For IDE debugging, use debugger wait patterns
