Migrating existing Python projects to Valohai

Notebook users

When running Valohai executions from Jupyter Notebooks, you don’t need to setup the valohai.yaml configuration file, as it will be automatically generated by Jupyhai.

See Jupyter Notebooks with Valohai for details on running Notebooks on Valohai.

Bringing your existing projects to Valohai is straightforward.

  1. Create a Project and connect it to your Git repository

  2. Add a valohai.yaml configuration file to the root of your repository

  3. Inside the configuration file:
    - step:
        name: Train model
        image: tensorflow/tensorflow:2.1
        command: python train.py {parameters}
        inputs:
            - name: my-sample-input
              default: s3://mybucket/data/mydata.csv
        parameters:
            - name: learningrate
              type: float
              default: 0.001
    
  4. Update your code to read data from VH_INPUTS_DIR and save data to VH_OUTPUTS_DIR, instead of reading/saving to a local disk or a cloud storage location.
    import os
    import pandas as pd
    
    # Directory for all downloaded input datasets
    VH_INPUTS_DIR = os.getenv('VH_INPUTS_DIR')
    # Directory where you should save all files that you want to keep
    VH_OUTPUTS_DIR = os.getenv('VH_OUTPUTS_DIR')
    
    # Load input file from the inputs directory
    # Note that Valohai creates a new folder for each defined input (e.g. my-sample-input) and saves the individual files in that folder
    # e.g. /valohai/inputs/my-sample-input/mydata.csv
    df = pd.read_csv(os.path.join(VH_INPUTS_DIR, "my-sample-input", "mydata.csv"))
    
  5. Read Valohai parameters in your code
    import argparse
    
    def parse_args():
        parser = argparse.ArgumentParser()
        parser.add_argument('--learningrate', type=float, default=0.001)
        return parser.parse_args()
    
    args = parse_args()
    
  6. Start collecting important metrics by printing Valohai Metadata
    import json
    
    print(json.dumps({
        'step': epoch,
        'accuracy': str(logs['acc']),
    }))
    
Metadata chart comparison

See also

Find a example of a valohai.yaml file in our quickstart tutorial or for a more complex example see the TensorFlow sample