Convert to Production Script

Notebooks are built for experimentation. When your code is stable and you want to run it at scale, schedule it, or chain it into pipelines, convert it to a script that runs as a standard execution.

When to Convert

Move from notebooks to executions when you:

  • Need to schedule runs or trigger them via API

  • Want to chain your job into a pipeline

  • Are ready to commit your code to version control

Executions are defined in valohai.yaml and run as scripts, not interactive notebooks. This gives you versioning, reproducibility, and production-grade orchestration.

The Two-Step Journey

Step 1: Test with Run Remote

Before migrating to valohai.yaml, validate that your notebook runs end-to-end using Run Remote. This creates a standard execution from your notebook without leaving Jupyter.

Add this cell at the top of your notebook:

import valohai

valohai.prepare(
    step='my-experiment',
    image='python:3.9',
    default_inputs={
        'dataset': 's3://mybucket/data/train.csv'
    },
    default_parameters={
        'learning_rate': 0.001,
        'epochs': 10
    }
)

Click the Run Remote button in Jupyter. Valohai executes your entire notebook top-to-bottom in a fresh environment and creates a versioned execution.

This execution runs independently—you can close your notebook, and the job continues. Your team can also reproduce it without needing Jupyter.

Step 2: Migrate to valohai.yaml

Once your Run Remote execution succeeds, move your code to a proper script and define it in valohai.yaml.

Extract your notebook code to a Python script:

# train.py
import valohai
import pandas as pd

# Define inputs and parameters
valohai.prepare(
    step='training',
    image='python:3.9',
    default_inputs={
        'dataset': 's3://mybucket/data/train.csv'
    },
    default_parameters={
        'learning_rate': 0.001,
        'epochs': 10
    }
)

# Your training logic
learning_rate = valohai.parameters('learning_rate').value
epochs = valohai.parameters('epochs').value

df = pd.read_csv(valohai.inputs('dataset').path())
# ... training code that produces a trained `model` ...

# Save outputs to the Valohai outputs directory
model.save(valohai.outputs().path('model.h5'))
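
Before generating the YAML, you can smoke-test the script outside Valohai. valohai-utils is designed so the same script also runs locally: parameters fall back to their declared defaults, and default inputs are fetched to a local cache (this assumes your machine can reach s3://mybucket):

pip install valohai-utils
python train.py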

Define the step in valohai.yaml:

Generate the YAML entry by running the following on your local machine, pointing at the script you just created:

vh yaml step train.py

💡 The command parses the valohai.prepare() call in train.py and generates the matching step definition (named training here).

This creates valohai.yaml if it doesn't exist, or updates your existing file. The generated entry looks like this:

- step:
    name: training
    image: python:3.9
    command:
      - python train.py
    inputs:
      - name: dataset
        default: s3://mybucket/data/train.csv
    parameters:
      - name: learning_rate
        type: float
        default: 0.001
      - name: epochs
        type: integer
        default: 10

Now you can run this job from the UI, CLI, or API. You can parameterize it, chain it into pipelines, or use it in hyperparameter sweeps.
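
For example, with valohai-cli installed, launching a run from your project root looks roughly like this; the --adhoc flag packages your local files without requiring a Git commit, and parameter overrides are passed as options named after your parameters:

vh execution run training --adhoc
vh execution run training --adhoc --learning_rate=0.01 --epochs=20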

What Changes in Production

Notebooks:

  • Interactive cell-by-cell execution

  • Manual re-runs

  • Outputs saved when you stop the notebook

  • Versioned as .ipynb files

Executions:

  • Run scripts start-to-finish automatically

  • Triggered via UI, CLI, API, or schedules

  • Outputs saved as the script produces them

  • Versioned with Git commits and execution metadata

The core logic stays the same. You're just moving from interactive development to automated execution.

Migration Checklist

Before converting, ensure your notebook runs cleanly top-to-bottom in a fresh kernel, with no out-of-order cells or hidden state. If your notebook relies on manual steps, refactor those into configurable parameters first, as in the sketch below.
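
For instance, a value you normally edit by hand in a cell can become a parameter. A minimal sketch, using a hypothetical threshold value:

import valohai

# Before: edited by hand between runs
# threshold = 0.5

# After: declared once, overridable per execution without touching the code
valohai.prepare(
    step='training',
    image='python:3.9',
    default_parameters={'threshold': 0.5},
)
threshold = valohai.parameters('threshold').value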
