Docker in Valohai

Valohai uses Docker images to define your runtime environment. This means you can run any code (Python, R, Julia, C++, third-party libraries, or custom binaries) as long as it runs inside a container.

Depending on the environment, Valohai may use a different container platform, such as Singularity.

Do I need a container?

Yes, but you don't need to build one yourself.

Every execution in Valohai requires a Docker image. The good news: you can start with public images that already exist.

Common starting points:

# TensorFlow
image: tensorflow/tensorflow:2.13.0-gpu
# PyTorch
image: pytorch/pytorch:2.0.1-cuda11.7-cudnn8-runtime
# Python (any framework)
image: python:3.11
# R
image: r-base:4.3.0

You reference an image by its full name. Valohai downloads it from the registry (Docker Hub, AWS ECR, etc.), caches it on the worker machine, and uses it for your execution.
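For example, a minimal step in valohai.yaml references the image by its full name (the step name and command below are placeholders):

```yaml
- step:
    name: train-model
    image: tensorflow/tensorflow:2.13.0-gpu
    command:
      - python train.py
```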

What goes inside a Docker image?

A Docker image should contain your runtime dependencies:

  • System libraries (CUDA, OpenCV, FFmpeg)

  • Python/R packages (TensorFlow, PyTorch, scikit-learn)

Don't include:

  • Your ML code (comes from Git)

  • Your data (comes from data stores)

  • Secrets or credentials (use environment variables)

Do I need to rebuild every time I change my code?

No. When experimenting, you can easily install additional packages as part of your step.

Install additional packages without building a new Docker image

Install packages directly as part of your step:

  command:
    - pip install transformers==4.30.0
    - python train.py

This is fast for iteration. You change your step.command in valohai.yaml and run. No Docker build needed.
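If you depend on more than a couple of packages, you can keep them in a requirements file in your repository and install from it instead; this sketch assumes a requirements.txt at the repository root:

```yaml
  command:
    - pip install -r requirements.txt
    - python train.py
```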

When to build a custom image

Build a custom image when:

  • You use the same dependencies often: baking them into an image saves time on every execution, as you don't have to wait for the installs to complete.

  • You're building pipelines: avoids reinstalling packages at each pipeline node

  • You're running production workloads: faster startup, better versioning, more reproducible

  • You need system-level dependencies, e.g., CUDA libraries, compiled binaries, or specific OS packages

Key point: Build images for speed and reproducibility, not just because you added one package.
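A custom image along these lines bakes the dependencies in once, so executions skip the install step entirely. This is an illustrative sketch, assuming a requirements.txt that lists your packages:

```dockerfile
FROM python:3.11

# Bake runtime dependencies into the image.
# Code and data stay out: code comes from Git, data from data stores.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
```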

Does it need Valohai-specific components?

No. Your Docker image should be generic and reusable.

Valohai doesn't require anything special in your image. The platform handles data mounting, code injection, logging, and so on for the container, regardless of which image you use.

Your image just needs to run your code. That's it.

Where can I host Docker images?

You can use any Docker registry:

Public registries:

  • Docker Hub (default for most public images)

  • GitHub Container Registry

Private registries:

  • AWS Elastic Container Registry (ECR)

  • Google Cloud Artifact Registry

  • Azure Container Registry

  • JFrog Artifactory

  • Self-hosted registries

Valohai supports authentication for all major private registries. See Private Docker Registries for setup instructions.
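Referencing a private image works the same way as a public one; you just use the registry's full path. The account ID, region, and repository name below are placeholders:

```yaml
image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-team/trainer:1.2.0
```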

Common questions

Can I use root inside the container?

Yes. By default, Valohai runs containers as root, but your custom Docker image can override this using the USER directive in your Dockerfile.
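If you want the container to run as a non-root user, the Dockerfile change is a standard USER directive; the username here is a placeholder:

```dockerfile
FROM python:3.11

# Create a non-root user and switch to it for all subsequent commands.
RUN useradd --create-home appuser
USER appuser
```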

What about custom ENTRYPOINT or CMD?

Valohai explicitly overrides ENTRYPOINT and replaces CMD with your step commands. This is by design: only your step commands execute inside the container.

If you need specific startup behavior, include it in your step commands or use an init script.
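For example, startup behavior can simply be the first command of the step; the setup script name here is hypothetical:

```yaml
  command:
    - bash ./setup_environment.sh
    - python train.py
```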

Next steps

Starting out? Use a public image like python:3.11 or tensorflow/tensorflow:2.13.0-gpu. Install packages in your script as needed.

Ready to optimize? Learn how to build custom images and follow our best practices.

Using private images? Set up authentication with your private registry.
