Clone Repositories During Execution
Fetch additional private repositories at runtime
Sometimes you need to access code or files from another repository during execution—like shared utilities, model definitions, or configuration files. This guide shows how to securely clone private repositories at runtime.
💡 Reproducibility note: Valohai tracks the commit of your connected repository automatically. Commits from other repositories cloned during execution are not tracked. For full reproducibility, consider using Git submodules instead.
Use Cases
Clone additional repositories when:
You need utility functions from a shared library
Model definitions are in a separate repo
Configuration files are centrally managed
You're testing code before making it a submodule
Don't use this for:
Datasets (use Valohai data inputs instead)
Model checkpoints (use execution outputs)
Anything that changes frequently (consider merging repos)
Clone a Private Repository
Step 1: Generate SSH Key
Create an SSH key pair for accessing the external repository:
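ssh-keygen -t rsa -b 4096 -f valohai-external-key
Leave the passphrase empty when prompted, because the execution cannot enter one interactively.
This creates:
valohai-external-key.pub – Public key (add to Git provider)
valohai-external-key – Private key (add to Valohai)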
Step 2: Add Public Key to Git Provider
Add the public key as a deploy key on the repository you want to clone:
GitHub:
Go to the repository → Settings → Deploy keys → Add deploy key
Paste the contents of valohai-external-key.pub
Leave "Allow write access" unchecked
GitLab:
Go to the repository → Settings → Repository → Deploy keys
Paste the contents of valohai-external-key.pub
Bitbucket:
Go to the repository → Settings → Access keys → Add key
Paste the contents of valohai-external-key.pub
Step 3: Add Private Key to Valohai
Store the private key as a secret environment variable:
Open your Valohai project
Go to Settings → Environment Variables
Add a new variable:
Name: EXTERNAL_REPO_KEY
Value: The private key with \n replacing newlines
Important: Valohai doesn't encode newlines automatically. Format your key like this:
-----BEGIN OPENSSH PRIVATE KEY-----\n<key content>\n-----END OPENSSH PRIVATE KEY-----
Check Secret to hide the value from logs
Click Save
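Converting the key to a single line by hand is error-prone, so a small local script can help. The following is a minimal sketch, not part of Valohai's tooling: `escape_private_key` is a hypothetical helper name, and it assumes the key file produced in Step 1 is plain text with one newline per line.

```python
from pathlib import Path


def escape_private_key(key_text: str) -> str:
    """Collapse a multi-line private key into one line with literal \\n markers."""
    lines = key_text.strip().splitlines()
    # Join with a backslash followed by "n" (two characters), not a real newline.
    return "\\n".join(line.strip() for line in lines)


def escape_private_key_file(path: str) -> str:
    """Read a key file from disk and return its one-line, \\n-escaped form."""
    return escape_private_key(Path(path).read_text())
```

Run it on your own machine, for example `print(escape_private_key_file("valohai-external-key"))`, and paste the result into the Value field. Avoid running it anywhere the output would be captured in shared logs.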
Step 4: Clone in Your Step
Update your valohai.yaml to clone the repository during execution:
- step:
    name: train-with-external-repo
    image: python:3.11
    command:
      # Install Git
      - apt-get update
      - apt-get install -y git
      # Write the SSH key to a file (echo -e turns the \n markers back into newlines)
      - echo -e "$EXTERNAL_REPO_KEY" > ~/external_key
      - chmod 600 ~/external_key
      # Configure Git to use the key
      - export GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no -i ~/external_key"
      # Clone the external repository
      - git clone git@github.com:username/external-repo.git /external-repo
      # Verify the clone
      - ls -la /external-repo
      # Run your training script
      - python train.py
The cloned repository will be available at /external-repo during execution.
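The same steps can also be driven from Python rather than shell commands. The sketch below is one possible approach under stated assumptions, not Valohai's own method: `write_private_key` and `clone_with_key` are hypothetical helper names, the key is assumed to be stored in `EXTERNAL_REPO_KEY` with literal `\n` markers, and `git` and `ssh` are assumed to be installed in the image.

```python
import os
import shlex
import subprocess
from pathlib import Path


def write_private_key(escaped_key: str, key_path: Path) -> Path:
    """Turn literal \\n markers back into newlines and write an owner-only key file."""
    key_text = escaped_key.replace("\\n", "\n")
    if not key_text.endswith("\n"):
        key_text += "\n"  # key files conventionally end with a newline
    # Create the file with mode 0600 so the key is never world-readable.
    fd = os.open(key_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        handle.write(key_text)
    os.chmod(key_path, 0o600)  # also covers a pre-existing file with looser permissions
    return key_path


def clone_with_key(repo_url: str, target_dir: str, key_path: Path) -> None:
    """Run `git clone` with GIT_SSH_COMMAND pointing at the given key file."""
    env = dict(os.environ)
    env["GIT_SSH_COMMAND"] = (
        "ssh -o StrictHostKeyChecking=no -i " + shlex.quote(str(key_path))
    )
    subprocess.run(["git", "clone", repo_url, target_dir], env=env, check=True)


# Example usage inside an execution (commented out so importing this file has no side effects):
# key_file = write_private_key(os.environ["EXTERNAL_REPO_KEY"], Path.home() / "external_key")
# clone_with_key("git@github.com:username/external-repo.git", "/external-repo", key_file)
```

Setting `GIT_SSH_COMMAND` only in the environment passed to `subprocess.run` keeps the key configuration scoped to that single clone.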
Access Files from the Cloned Repo
Use the cloned repository in your Python code:
import sys
import yaml  # provided by the PyYAML package

# Make the cloned repository importable
sys.path.append('/external-repo')
# Import modules from the external repo
from external_repo.utils import preprocess_data
# Or read configuration files
with open('/external-repo/config.yaml', 'r') as f:
    config = yaml.safe_load(f)
Clone a Public Repository
Public repositories are simpler—no SSH key needed:
- step:
    name: train-with-public-repo
    image: python:3.11
    command:
      - apt-get update && apt-get install -y git
      - git clone https://github.com/username/public-repo.git /public-repo
      - python train.py
Clone Multiple Repositories
You can clone multiple repos in the same execution:
- step:
    name: train-with-multiple-repos
    image: python:3.11
    command:
      - apt-get update && apt-get install -y git
      # Set up SSH key
      - echo -e "$EXTERNAL_REPO_KEY" > ~/key
      - chmod 600 ~/key
      - export GIT_SSH_COMMAND="ssh -o StrictHostKeyChecking=no -i ~/key"
      # Clone first repo
      - git clone git@github.com:username/repo1.git /repo1
      # Clone second repo
      - git clone git@github.com:username/repo2.git /repo2
      # Use them in your script
      - python train.py
Clone Specific Branch or Commit
Clone a specific branch, or check out a specific commit when you need reproducibility (a branch can move, a commit cannot):
# Clone a specific branch
git clone -b feature-branch git@github.com:username/repo.git /repo
# Clone a specific commit
git clone git@github.com:username/repo.git /repo
cd /repo
git checkout abc123def
Or use shallow clones to save time:
# Clone only the latest commit (faster)
git clone --depth 1 git@github.com:username/repo.git /repo
Security Best Practices
Use read-only deploy keys
Don't grant write access
Create separate keys for each external repo
Rotate keys periodically
Mark environment variables as secrets
Always check the Secret box in Valohai
Secrets are hidden from logs and UI
They're still accessible in your code
Use SSH, not HTTPS with tokens
SSH keys are more secure than embedded tokens
They're easier to rotate
They don't expire
Don't log private keys
Never echo $EXTERNAL_REPO_KEY without redirecting to a file
Be careful with set -x or similar debugging flags
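If your own Python code logs commands or environment details, one way to reduce the risk of leaking the key is to mask it before anything is printed. This is a minimal sketch with a hypothetical helper name (`redact`); it only masks exact matches of the secret and is not a substitute for avoiding such logging in the first place.

```python
import os


def redact(message: str, secret_names=("EXTERNAL_REPO_KEY",), env=None) -> str:
    """Replace the values of the named secret environment variables in a message."""
    env = os.environ if env is None else env
    for name in secret_names:
        secret = env.get(name, "")
        if not secret:
            continue
        # Mask both the stored one-line form and the decoded multi-line form.
        for variant in (secret, secret.replace("\\n", "\n")):
            message = message.replace(variant, f"<{name} redacted>")
    return message
```

Calling `print(redact(some_log_line))` instead of `print(some_log_line)` keeps the raw key out of the execution log even when the line happens to contain it.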
Troubleshooting
"Permission denied (publickey)"
The SSH key wasn't added correctly
Check that the public key is in the Git provider's deploy keys
Verify the private key format (must include \n for newlines)
"Host key verification failed"
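Use StrictHostKeyChecking=no in GIT_SSH_COMMAND
This skips host key verification. It is a common shortcut when cloning from well-known Git providers, but it does not protect against man-in-the-middle attacks; to verify the host instead, add the provider's host key to ~/.ssh/known_hosts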
"Repository not found"
Check the repository URL (must be SSH format: git@...)
Ensure the deploy key has access to the repo
Verify the repository exists and is accessible
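A frequent cause of this error is pasting the HTTPS URL from the browser when an SSH URL is required. The sketch below converts one form to the other; `https_to_ssh_url` is a hypothetical helper, and it assumes the provider uses the `git` user with the `git@host:owner/repo.git` syntax, as GitHub, GitLab, and Bitbucket do.

```python
from urllib.parse import urlparse


def https_to_ssh_url(url: str) -> str:
    """Convert https://host/owner/repo(.git) into git@host:owner/repo.git."""
    if url.startswith("git@") or url.startswith("ssh://"):
        return url  # already an SSH-style URL, leave it unchanged
    parsed = urlparse(url)
    if parsed.scheme != "https" or not parsed.hostname:
        raise ValueError(f"Not an HTTPS repository URL: {url!r}")
    path = parsed.path.strip("/")
    if not path:
        raise ValueError(f"URL has no repository path: {url!r}")
    if not path.endswith(".git"):
        path += ".git"
    return f"git@{parsed.hostname}:{path}"
```

For example, `https_to_ssh_url("https://github.com/username/external-repo")` yields an address you can pass directly to `git clone` together with the deploy key.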
"Command not found: git"
Install Git in your Docker image or step command
Use apt-get install -y git for Debian/Ubuntu images
"Newline characters in key cause errors"
Valohai doesn't automatically encode newlines
Replace actual newlines with \n in the environment variable
The key should be one long line with \n markers
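To catch formatting mistakes before the clone fails, you could check the variable at the start of the execution. This is a rough sketch with a hypothetical function name (`check_escaped_key`); it only inspects the overall shape of the value and does not validate the key material itself.

```python
def check_escaped_key(value: str) -> list[str]:
    """Return a list of formatting problems found in a one-line, \\n-escaped key."""
    problems = []
    if "\n" in value.strip():
        problems.append("value contains real newlines; replace them with literal \\n")
    if "\\n" not in value:
        problems.append("no literal \\n markers found")
    # Decode the markers and look at the first and last lines.
    lines = value.replace("\\n", "\n").strip().splitlines()
    first = lines[0] if lines else ""
    last = lines[-1] if lines else ""
    if not (first.startswith("-----BEGIN ") and first.endswith("PRIVATE KEY-----")):
        problems.append("missing BEGIN ... PRIVATE KEY header")
    if not (last.startswith("-----END ") and last.endswith("PRIVATE KEY-----")):
        problems.append("missing END ... PRIVATE KEY footer")
    return problems
```

An empty list means the value has the expected shape. When reporting problems, print only the messages and never the key itself.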
Alternative: Use Submodules
If you frequently clone the same repositories, consider Git submodules.
Submodules provide:
Automatic version tracking
Simpler setup (no environment variables)
Better reproducibility
Next Steps
Git Submodules for better version tracking
Environment Variables for managing secrets
Private Repositories for connecting your main repo
