Valohai provides the capability to remotely access live executions using SSH (Secure Shell). SSH is a versatile and low-level protocol suitable for various tasks.
Use cases
- Inspecting the Execution: You can use an interactive terminal to inspect the details and progress of the execution in real-time.
- IDE Debugging: Easily connect your preferred integrated development environment (IDE) debugger such as VSCode or PyCharm to troubleshoot and debug your code during the execution.
- Integration with 3rd Party Tools: Establish low-latency connections with third-party tools like TensorBoard for real-time visualization and monitoring.
Requirements for SSH Access
- SSH access is available for enterprise users utilizing on-premises, Amazon Web Services, Microsoft Azure or Google Cloud environments within Valohai.
- Your organization’s administrator must enable SSH connections to Valohai workers and make any necessary adjustments to the firewall rules within your chosen cloud provider to facilitate secure SSH access.
To configure SSH access for your organization, your organization administrator should follow these steps:
Define the default SSH port for your organization
- Log in to your Valohai account and navigate to the organization management section by clicking on Hi, <name> in the top right menu, then select Manage <organization>.
- Under the organization controls, go to “Settings.”
- Set a Default Debug Port for your organization, ensuring that the chosen port number is above 1023.
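Ports up to 1023 are privileged, which is why the debug port has to be higher. As a minimal sketch, a check like the one below can validate a candidate port before you enter it in the settings. The helper name is_valid_debug_port is hypothetical and not part of any Valohai tooling.

```shell
# Hypothetical helper: succeeds only for an unprivileged TCP port (1024-65535).
is_valid_debug_port() {
  port="$1"
  case "$port" in
    ''|*[!0-9]*) return 1 ;;   # reject empty or non-numeric input
  esac
  [ "$port" -ge 1024 ] && [ "$port" -le 65535 ]
}

is_valid_debug_port 2222 && echo "2222 is usable as a debug port"
is_valid_debug_port 22 || echo "22 is a privileged port, pick another one"
```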
Allow connections
AWS
- In the AWS Management Console, open the Security Group named “valohai-sg-workers.”
- Click “Edit Inbound Rules” to add a new inbound Custom TCP rule.
- Configure the rule as follows:
- Type: Custom TCP
- Port Range: The port number you specified in your Valohai organization’s settings.
- Source: Depending on your organization settings, you can set the Source as either “0.0.0.0/0” to allow connections from anywhere or whitelist specific IP ranges/source tags.
- Description: “Allows connecting to Valohai jobs over SSH”
Setting the source as “0.0.0.0/0” means that inbound connections will be allowed from all addresses. However, note that users will still need the SSH Private Key (generated below) to authenticate and successfully connect.
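If you prefer the AWS CLI over the console, the same inbound rule can be sketched roughly as follows. The helper only prints the command so you can review it before running it. The security group ID and the CIDR range are made-up placeholders, and the simple --cidr form shown here does not set the rule description.

```shell
# Prints (does not run) the AWS CLI call that opens the debug port.
# Arguments: security group ID, port, source CIDR (all supplied by you).
print_aws_ingress_command() {
  printf 'aws ec2 authorize-security-group-ingress --group-id %s --protocol tcp --port %s --cidr %s\n' "$1" "$2" "$3"
}

# Hypothetical values: a made-up group ID for valohai-sg-workers and a
# documentation-only CIDR range. Replace both with your own.
print_aws_ingress_command sg-0123456789abcdef0 2222 203.0.113.0/24
```

Copy the printed command into a shell that has AWS credentials configured once the values look right.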
Google Cloud
- In Google Cloud Platform (GCP), create a new firewall rule with the following details:
- Name: valohai-fr-worker-ssh
- Description: “Allows connecting to Valohai jobs over SSH”
- Network: The network where your Valohai resources are created (e.g., valohai-vpc)
- Direction: Ingress
- Targets: Specified target tags: valohai-worker
- Source: Depending on the organization settings, you can set the Source as either “0.0.0.0/0” to allow connections from anywhere or whitelist specific IP ranges/source tags.
- Specified protocols and ports: TCP, with the port number you specified in your Valohai organization’s settings.
Setting the source as “0.0.0.0/0” means that inbound connections will be allowed from all addresses. However, users will still need the SSH Private Key (generated below) to authenticate and successfully connect.
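The firewall rule above could also be created with the gcloud CLI. The sketch below only prints the command for review. The network name, port, and source range are assumptions that you should replace with your own values.

```shell
# Prints (does not run) the gcloud call that creates the firewall rule.
# Arguments: network name, port, source range (all supplied by you).
print_gcp_firewall_command() {
  printf 'gcloud compute firewall-rules create valohai-fr-worker-ssh --network %s --direction INGRESS --allow tcp:%s --target-tags valohai-worker --source-ranges %s\n' "$1" "$2" "$3"
}

# Hypothetical values: the example network name from this guide and a
# documentation-only source range.
print_gcp_firewall_command valohai-vpc 2222 203.0.113.0/24
```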
Generate an SSH Key Pair
To create an SSH key pair for securing the connection in Valohai, you have the option to use an existing key pair or generate a new one.
Please note that for security reasons, it’s recommended to periodically regenerate your SSH key pair according to your organization’s security standards. Additionally, the SSH key pair used for debugging connections to the Docker container should be separate from any keys used to access the server where the container runs.
Automatic Keypair Generation (UI Option)
Valohai can automatically generate the key pair for you when starting an execution from the Valohai UI. If you prefer to create the keys yourself, follow the instructions below.
Manual SSH Keypair Generation
You can manually create a new SSH key pair using the ssh-keygen command. Follow these steps:
- Open your terminal or command prompt.
- Use the following ssh-keygen command to create a new SSH key pair:
ssh-keygen -t rsa -b 4096 -N '' -f my-debug-key
This command performs the following actions:
- -t rsa: Specifies the key type as RSA.
- -b 4096: Sets the key length to 4096 bits (adjustable to your security needs).
- -N '': Specifies an empty passphrase for the private key (you can add a passphrase for additional security if desired).
- -f my-debug-key: Defines the file name for the generated keys (modify it to your preferred file name).
After running the command, two files will be generated in your current directory:
- my-debug-key.pub: This is the public key that you will need to paste into the Valohai UI before starting an execution.
- my-debug-key: This is the private key that you will use to connect to the execution.
Ensure that you securely manage and store your private key (my-debug-key) as it provides access to your Valohai execution. The public key (my-debug-key.pub) will be used to establish the secure connection.
With your SSH key pair generated, you can now use it for debugging connections within Valohai.
Don’t include the keys in your version control
You should not include these keys in version control. Anybody who gains access to the contents of the private key file (my-debug-key in the example above) will have access to your execution, so use appropriate caution.
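One way to reduce the risk, assuming your project uses Git, is to restrict the private key's permissions and add both key files to .gitignore. The protect_debug_key helper below is a hypothetical sketch, not something Valohai provides.

```shell
# Hypothetical helper: restricts the private key to the current user and
# keeps both key files out of Git. Run it from the repository root.
# Argument: the file name passed to ssh-keygen -f (my-debug-key in this guide).
protect_debug_key() {
  key_name="$1"
  # Make sure only the owner can read or write the private key.
  [ -f "$key_name" ] && chmod 600 "$key_name"
  # Append ignore rules only if they are not already present.
  for pattern in "$key_name" "$key_name.pub"; do
    grep -qxF "$pattern" .gitignore 2>/dev/null || echo "$pattern" >> .gitignore
  done
}

# Usage: protect_debug_key my-debug-key
```

Calling the helper twice does not duplicate the .gitignore entries.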
Start an execution with SSH
You can start a job either from the command-line or from the web application.
Command-line
Start a Valohai execution with two extra options: --debug-key-file for the path to your public key file and --debug-port for the port you have opened for debug connections.
vh exec run --adhoc --debug-key-file=/tmp/remote-debug-key.pub --debug-port 2222 train
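A common slip is passing the private key instead of the public one. As a minimal sketch, a pre-flight check like the hypothetical looks_like_public_key helper below could run before vh exec run. It only inspects the first line of the file, covers the usual OpenSSH key types, and is not part of the Valohai CLI.

```shell
# Hypothetical helper: succeeds if the first line of the file looks like an
# OpenSSH public key ("<key-type> <base64-data> [comment]").
looks_like_public_key() {
  head -n 1 "$1" 2>/dev/null |
    grep -Eq '^(ssh-rsa|ssh-ed25519|ecdsa-sha2-nistp(256|384|521)) [A-Za-z0-9+/=]+'
}

# Usage:
#   looks_like_public_key /tmp/remote-debug-key.pub || echo "not a public key file"
```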
Web app
Start a Valohai execution with the “Run with SSH” option enabled.
If you created the key pair yourself, copy and paste the entire contents of the my-debug-key.pub file into the text field.
Alternatively, you can click the Generate new SSH key button and use the generated keys. Make sure to download and store the private key in a secure location.
Never include the keys in your version control! Finally, change the TCP/IP port if your network setup requires it.
Wait for an IP address
You need to start the Valohai execution before you can connect to it. Valohai will either run the execution on an existing virtual machine or create a new instance. Each machine has its own IP which is allocated by the cloud provider (e.g. AWS, GCP, Azure). You’ll need the IP in order to SSH into the execution.
Wait for the execution to start and watch the first log events for the IP address of the worker.
Once you have it, add the path to your private key and connect:
ssh -i <path-to-private-key> 52.214.159.193 -p 2222 -t /bin/bash
If you’re not seeing the IP address in the logs, make sure that you have added the SSH key when starting the execution.
Open SSH Connection
Depending on your use case, you may want to do one of the following:
- Run a single remote command
- Open an interactive shell
- Open an SSH tunnel
Run a single command
This will execute the command and return the results to your terminal.
# template
ssh -i <path-to-private-key> <ip-address> -p <port> -t <command>
# example
ssh -i /home/johndoe/.ssh/my-debug-key 52.214.159.193 -p 2222 -t ps aux
Open an interactive shell
This lets you connect to the execution and run commands directly inside the Docker container that is running it.
# template
ssh -i <path-to-private-key> <ip-address> -p <port> -t /bin/bash
# example
ssh -i /home/johndoe/.ssh/my-debug-key 52.214.159.193 -p 2222 -t /bin/bash
Open an SSH tunnel
The template looks like this:
# template
ssh -i <path-to-private-key> <ip-address> -p <port> -t -L<local-port>:<remote-host>:<remote-port>
Your command could look like this:
# example
ssh -i ~/.ssh/remote-debug-key 34.245.207.101 -p 2222 -t -L5678:127.0.0.1:5678
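In the -L option, the first number is the port opened on your local machine, and the host and port after it are resolved on the remote side, so 127.0.0.1 refers to the container itself. As a rough sketch, the hypothetical helper below assembles such a command. The example forwards port 6006, the default TensorBoard port, and assumes TensorBoard is already running inside the execution.

```shell
# Hypothetical helper: prints the ssh command for a local port forward.
# Arguments: private key path, worker IP, SSH port, local port, remote port.
print_tunnel_command() {
  printf 'ssh -i %s %s -p %s -t -L%s:127.0.0.1:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Example: reach a service on port 6006 in the container via localhost:6006.
print_tunnel_command "$HOME/.ssh/my-debug-key" 52.214.159.193 2222 6006 6006
```

While the tunnel is open, the forwarded service should be reachable at http://localhost:6006 on your own machine.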
How to keep the execution running?
Your execution is designed to start, compute, and shut down, including when it hits an error. When debugging, you want to keep the execution running even if it fails.
The safest way is to add a sleep command at the end of the execution.
python train.py {parameters}
sleep 1h
This way, the execution will wait for an hour and then shut down. It is better to set a reasonable time limit instead of an infinite uptime to avoid costly mistakes.
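If you would rather hold the container only when the command fails, a small wrapper is one option. The sketch below is an assumption-laden variant of the sleep approach, not a Valohai feature: run_and_hold and HOLD_SECONDS are hypothetical names, and the hold time defaults to one hour.

```shell
# Hypothetical wrapper: runs a command and, if it fails, waits HOLD_SECONDS
# (default 3600) before returning the original exit code.
run_and_hold() {
  status=0
  "$@" || status=$?
  if [ "$status" -ne 0 ]; then
    echo "Command failed with exit code $status, holding for ${HOLD_SECONDS:-3600}s" >&2
    sleep "${HOLD_SECONDS:-3600}"
  fi
  return "$status"
}

# Usage inside a step command:
#   run_and_hold python train.py {parameters}
```

Because the original exit code is returned, a failed run is still reported as failed once the hold period ends.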
Attach a debugger
See our how-to guides on attaching a debugger to learn how to connect your local IDE to a Valohai job and make the job wait for the debugger before running your code.
Limitations
It is essential to understand that the SSH connection is not directly to the worker operating system.
The connection opens remote access to the Docker container running within that host operating system. This means that the Valohai platform internals and the rest of the host operating system are not available for inspection.