Oracle Kubernetes
Connect Valohai with Oracle Kubernetes Engine (OKE) for ML workloads
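This guide walks you through connecting Valohai to an Oracle Kubernetes Engine (OKE) cluster: creating the cluster, configuring local access, installing Valohai workers with Helm, and verifying the setup.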
Prerequisites
Tools:
kubectl - installation guide
helm - installation guide
oci-cli - installation guide
Oracle Cloud:
Oracle Cloud account
Permissions to create and manage OKE clusters
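Before starting, it can help to confirm that the three tools are on your PATH. The following is a minimal sketch, not part of the official setup; the function name missing_tools is a hypothetical helper, and it assumes the OCI CLI is installed under its usual command name, oci.

```shell
# Hypothetical helper: print each named tool that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Check the three tools listed in the prerequisites.
missing=$(missing_tools kubectl helm oci)
if [ -n "$missing" ]; then
  echo "Missing tools:"
  echo "$missing"
else
  echo "All prerequisite tools found on PATH."
fi
```

This only checks that the commands exist; it does not check their versions.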
Step 1: Create the OKE Cluster
Set Up the Cluster
1. Navigate to cluster management
Log in and navigate to https://cloud.oracle.com/containers/clusters
2. Create cluster
Click Create cluster and select Quick Create.
3. Configure the cluster
Official setup guide: Oracle OKE Cluster Creation
Configuration:
Name: Give your cluster a name
Endpoint: Select Public Endpoint (unless in an air-gapped environment)
Worker type: Select Managed (if utilizing an autoscaler)
Worker visibility: Select Private Workers
Resources: Pick the resources you wish to allocate, including the number of nodes
4. Review and create
Proceed with the Review section and click Create.
Step 2: Configure Local Access
Set Up OCI CLI
Create the .oci directory and configure CLI access:
oci setup config
This command will prompt you for various OCIDs. Refer to Oracle's documentation on finding OCIDs.
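Because the prompts expect OCIDs, a quick shape check can catch copy-paste mistakes before you enter them. This is a rough sketch under the assumption that OCIDs follow the general pattern ocid1.<resource type>.<realm>.[region].<unique id>; the function name looks_like_ocid and the example value are made up for illustration.

```shell
# Hypothetical helper: rough shape check for an OCID.
# Assumed pattern: ocid1.<resource type>.<realm>.[region].<unique id>
# It does not confirm that the resource exists or that you can access it.
looks_like_ocid() {
  case $1 in
    ocid1.?*.?*.*.?*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example with a made-up value (the region part may be empty).
if looks_like_ocid "ocid1.tenancy.oc1..exampleuniqueid"; then
  echo "plausible OCID"
else
  echo "does not look like an OCID"
fi
```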
Add Public Key to Oracle
1. Display the public key
cat ~/.oci/oci_api_key_public.pem
2. Copy the public key
Copy the output of the command.
3. Add to API Keys
Add the public key to the API Keys associated with your Oracle profile.
Refer to Oracle's documentation on API signing keys.
Create kubeconfig File
Create the kubeconfig file with cluster and endpoint information:
oci ce cluster create-kubeconfig \
--cluster-id <CLUSTER-OCID> \
--file $HOME/.kube/_ociconfig \
--kube-endpoint PUBLIC_ENDPOINT \
--profile DEFAULT
Parameters:
--file $HOME/.kube/_ociconfig - Specifies the location and creates the kubeconfig file
--kube-endpoint PUBLIC_ENDPOINT - Generates config for a public endpoint
--profile DEFAULT - Specifies the profile to use when interacting with Oracle Cloud
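The commands in this guide pass the kubeconfig path explicitly with --kubeconfig. As an optional alternative, you could export KUBECONFIG for the current shell session so that kubectl and helm pick the file up automatically. This is a sketch that assumes the file was written to $HOME/.kube/_ociconfig as in the command above.

```shell
# Point kubectl and helm at the generated kubeconfig for this shell session.
# Assumes the path used with --file in the create-kubeconfig command.
export KUBECONFIG="$HOME/.kube/_ociconfig"

# Warn early if the file has not been created yet.
if [ ! -f "$KUBECONFIG" ]; then
  echo "kubeconfig not found at $KUBECONFIG" >&2
fi
```

The export lasts only for the current session; add it to your shell profile if you want it to persist.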
Authenticate with Oracle CLI
Authenticate before proceeding with kubectl commands:
oci session authenticate
Step 3: Install Kubernetes Workers
Install Valohai workers using Helm.
Install with Helm
helm upgrade --install \
-n valohai-workers \
--create-namespace \
valohai-workers \
valohai/valohai-workers \
-f ~/custom-values.yaml \
--kubeconfig /Users/<REPLACE>/.kube/_ociconfig
Replace <REPLACE> with your username.
Note: Reach out to the Valohai team at [email protected] to get your custom-values.yaml file.
Custom Values File
The custom-values.yaml file contains:
siteName: SITE_NAME
imagePullCredentials:
  email: EMAIL
  username: USERNAME
  password: PASSWORD
cleaner:
  sentryDsn: SENTRY_URL
These values will be provided by Valohai.
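If you receive the file as a template, it can be worth checking that none of the uppercase placeholders shown above are left in it before running Helm. The following is a minimal sketch; the function name check_values_file is hypothetical, and it assumes the file lives at ~/custom-values.yaml as in the Helm command.

```shell
# Hypothetical helper: look for leftover template placeholders in a values file.
# Returns 0 if none remain, 1 if any remain, 2 if the file is missing.
check_values_file() {
  file=$1
  if [ ! -f "$file" ]; then
    echo "not found: $file" >&2
    return 2
  fi
  # Match lines whose value is exactly one of the placeholder names.
  if grep -Eq ':[[:space:]]*(SITE_NAME|EMAIL|USERNAME|PASSWORD|SENTRY_URL)[[:space:]]*$' "$file"; then
    echo "placeholders remain in $file" >&2
    return 1
  fi
  echo "no placeholders found in $file"
}

# Example call; prints a hint instead of failing when the check does not pass.
check_values_file "$HOME/custom-values.yaml" || echo "fix custom-values.yaml before running helm"
```

This is a text check only. It does not validate YAML syntax or confirm that the credentials work.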
Step 4: Complete Setup
Send Information to Valohai
Securely send the output from the Helm command to Valohai support at [email protected].
Valohai uses this information to access the cluster namespace and finish connecting your organization to Oracle Kubernetes Engine.
Information Needed
The Helm output should include:
Namespace details
Service account information
Cluster access credentials
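Since this output contains credentials, one option is to keep a private copy in a file when you run the install, rather than copying it from the terminal. The sketch below is an assumption about a convenient workflow, not a Valohai requirement; capture_output is a hypothetical helper that works with any command.

```shell
# Hypothetical helper: run a command, show its output, and keep a copy in a
# log file that only the current user can read (mode 600).
# Note: the return status is that of tee, not of the wrapped command.
capture_output() {
  log_file=$1
  shift
  : > "$log_file" && chmod 600 "$log_file" || return 1
  "$@" 2>&1 | tee "$log_file"
}

# Example usage with the Helm command from Step 3 (left as a comment here):
# capture_output "$HOME/valohai-helm-output.txt" helm upgrade --install \
#   -n valohai-workers --create-namespace valohai-workers valohai/valohai-workers \
#   -f ~/custom-values.yaml --kubeconfig "$HOME/.kube/_ociconfig"
```

Delete the file once the information has been sent through a secure channel.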
Step 5: Verify the Setup
After Valohai confirms the environment is configured:
1. Log in to app.valohai.com
Check that Oracle Kubernetes environments appear in your organization
2. Run a test execution
Create a test project
Run a simple execution
Verify it runs on your OKE cluster
3. Check results
Verify outputs are saved correctly
Check execution logs
Troubleshooting
Cannot authenticate with OCI CLI
Verify OCI configuration:
cat ~/.oci/config
Check that all OCIDs and paths are correct.
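A quick way to spot an incomplete config is to check for the entries that API key authentication normally relies on. This sketch assumes the standard key names user, fingerprint, tenancy, region, and key_file; the function name missing_oci_keys is hypothetical, and it scans the whole file rather than a single profile.

```shell
# Hypothetical helper: print each expected key that is absent from an OCI CLI
# config file. Scans the whole file, not one profile section.
missing_oci_keys() {
  config=$1
  for key in user fingerprint tenancy region key_file; do
    if ! grep -Eq "^[[:space:]]*${key}[[:space:]]*=" "$config"; then
      printf '%s\n' "$key"
    fi
  done
}

# Example call against the default config location, if it exists.
if [ -f "$HOME/.oci/config" ]; then
  missing_oci_keys "$HOME/.oci/config"
fi
```

No output means all five keys were found. It does not verify that the values are valid.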
Test authentication:
oci iam user get --user-id <YOUR-USER-OCID>
kubeconfig not working
Verify kubeconfig path:
echo $KUBECONFIG
If you rely on the KUBECONFIG environment variable instead of passing --kubeconfig, the output should be /Users/<your-username>/.kube/_ociconfig
Test connection:
kubectl get nodes --kubeconfig /Users/<your-username>/.kube/_ociconfig
Helm installation fails
Check namespace:
kubectl get namespaces --kubeconfig /Users/<your-username>/.kube/_ociconfig
Verify Helm can access cluster:
helm list -n valohai-workers --kubeconfig /Users/<your-username>/.kube/_ociconfig
Check custom-values.yaml:
Ensure all values are properly set
Verify no syntax errors in YAML
Pods not starting
Check pod status:
kubectl get pods -n valohai-workers --kubeconfig /Users/<your-username>/.kube/_ociconfig
Check pod logs:
kubectl logs <pod-name> -n valohai-workers --kubeconfig /Users/<your-username>/.kube/_ociconfig
Common issues:
Image pull errors (check credentials)
Insufficient resources (check node capacity)
Network policies blocking traffic
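To narrow down which pods need attention, you could filter the pod list by status. The sketch below assumes the default column layout of kubectl get pods (NAME, READY, STATUS, RESTARTS, AGE); the function name unhealthy_pods is hypothetical.

```shell
# Hypothetical helper: read `kubectl get pods --no-headers` output on stdin and
# print the name and status of pods that are neither Running nor Completed.
# Assumes STATUS is the third column of the default output.
unhealthy_pods() {
  awk '$3 != "Running" && $3 != "Completed" { print $1 " " $3 }'
}

# Example call; skipped when kubectl is not installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n valohai-workers --no-headers \
    --kubeconfig "$HOME/.kube/_ociconfig" | unhealthy_pods
fi
```

Pass each pod name it prints to the kubectl logs command above to see why the pod is not running.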
Getting Help
Valohai Support: [email protected]
Include in support requests:
Oracle Cloud region
Cluster OCID
kubectl version
OCI CLI version
Helm output or error messages
Pod logs if available
