Outputs: Save Models and Files

When your execution ends, Valohai needs to know which files to keep. This guide shows how to save your trained models, datasets, and other files.

💡 Already saving files locally? Just change the path to /valohai/outputs/ and Valohai handles the rest.

How Outputs Work

  1. Save to a special directory: /valohai/outputs/

  2. Valohai uploads automatically to your cloud storage (AWS S3, Azure Blob, Google Cloud Storage, MinIO, etc.)

  3. Every file is versioned, so nothing is accidentally overwritten

The /valohai/outputs/ directory already exists in every execution. Just save your files there.

Update Your Code

Before (Local)

# Saving locally
model.save('model.h5')
df.to_csv('results.csv')

After (Valohai)

# Save to Valohai outputs
model.save('/valohai/outputs/model.h5')
df.to_csv('/valohai/outputs/results.csv')

That's the only change needed.
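If the same script needs to run both locally and on Valohai, a small helper like the one below can pick the output directory. This is just a sketch, not part of any Valohai API; the ./outputs fallback is an arbitrary choice:

import os

def output_dir() -> str:
    """Return Valohai's output directory when available, otherwise a local ./outputs folder."""
    valohai_dir = '/valohai/outputs'
    if os.path.isdir(valohai_dir):
        return valohai_dir
    os.makedirs('outputs', exist_ok=True)  # local fallback for development runs
    return 'outputs'

model.save(os.path.join(output_dir(), 'model.h5'))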

Common Patterns

Save Multiple Files

import json
import matplotlib.pyplot as plt

# Models
model.save('/valohai/outputs/model.h5')

# Metrics
with open('/valohai/outputs/metrics.json', 'w') as f:
    json.dump(metrics, f)

# Plots
plt.savefig('/valohai/outputs/loss_curve.png')

Preserve Directory Structure

Your folder structure is maintained:

# This structure...
"""
/valohai/outputs/
├── models/
│   └── best_model.h5
├── logs/
│   └── training.log
└── visualizations/
    ├── loss.png
    └── accuracy.png
"""
# ...stays exactly the same in cloud storage
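Your code still needs to create any subdirectories before writing into them, just like on a local filesystem, for example:

import os

# Create the subdirectory first, then save into it
os.makedirs('/valohai/outputs/models', exist_ok=True)
model.save('/valohai/outputs/models/best_model.h5')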

Save During Training

Don't wait until the end; save checkpoints as you go. Files under the /valohai/outputs/ directory that are marked read-only are uploaded to the data store immediately, without waiting for the execution to finish.

import os
from stat import S_IREAD, S_IRGRP, S_IROTH

# ... model and training setup ...

for epoch in range(epochs):
    # Training code...

    if epoch % 10 == 0:
        filename = f'/valohai/outputs/checkpoint_epoch_{epoch}.h5'
        model.save(filename)
        # Mark the checkpoint read-only so it is uploaded right away
        os.chmod(filename, S_IREAD | S_IRGRP | S_IROTH)

Optional: Use the valohai-utils Python helper tool

The valohai-utils library offers convenience methods:

import valohai

# Generate output path
output_path = valohai.outputs().path("model.h5")
model.save(output_path)

# Live logging
valohai.outputs().live_upload("training.log")

This is optional; direct paths work just fine.
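The library is available on PyPI and can be installed with pip install valohai-utils.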

File Types and Sizes

Save any file type your code can create:

  • Models: .h5, .pkl, .pt, .onnx, .joblib

  • Data: .csv, .parquet, .json, .npz

  • Images: .png, .jpg, .pdf

  • Archives: .tar, .zip, .gz

  • R files: .rdata, .rds

No size limits; save what you need.
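If you'd rather upload many small files as a single artifact, you can archive a directory straight into the outputs folder. A sketch using only the standard library; the artifacts directory name is just an example:

import shutil

# Packs the local ./artifacts directory into /valohai/outputs/artifacts.zip
shutil.make_archive('/valohai/outputs/artifacts', 'zip', 'artifacts')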

Using Aliases (Human-Friendly Names)

Each job is isolated and its outputs are versioned on their own. If job #1 outputs a file called model.h5, it is versioned separately; when job #2 outputs another model.h5, Valohai won't overwrite the first file but stores the new one as a separate version.

We recommend looking into Valohai aliases if you want a friendly, stable name for a specific version of a file, for example:

  • latest-model-project-b

  • production-model-project-a

You can then reference the file as datum://production-model-project-a in future jobs instead of a long cloud storage URL.
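A later execution can then consume the aliased file as an input. The sketch below uses valohai-utils and assumes an input named model has been configured (in valohai.yaml or the UI) to point at datum://production-model-project-a:

import valohai

# Resolve the local path of the input named "model"
model_path = valohai.inputs("model").path()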

Where Do Outputs Go?

Valohai automatically uploads your outputs to:

  • Cloud: Your configured AWS S3, Azure Blob Storage, Google Cloud Storage, or OCI Object Storage

  • On-premise: MinIO, NetApp, or other S3-compatible storage

You don't need to write upload code. Save to /valohai/outputs/ and Valohai handles the rest.

Quick Reference

Essential Pattern

# Just prepend /valohai/outputs/ to your existing save paths
model.save('/valohai/outputs/model.h5')

Supported Storage

  • AWS S3

  • Azure Blob Storage

  • Google Cloud Storage

  • MinIO

  • NetApp

  • Any S3-compatible storage

Good to Know

  • Directory structure is preserved

  • Every execution's outputs are versioned separately

  • No file size or count limits

  • Files appear in UI immediately after execution

