After you define parameters in your valohai.yaml, you can add placeholders to your commands, which are replaced with the actual parameter values.
A common usage looks like this:
- step:
    name: train-model
    image: tensorflow/tensorflow:0.12.1-devel-gpu
    command: python train.py {parameters}
    parameters:
      - name: max_steps
        pass-as: --max_steps={v}
        type: integer
        default: 300
This makes it easy to track which parameters were used, to automate hyperparameter optimization, and to follow how different parameters affect your model's accuracy.
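On the script side, train.py has to accept the argument that the pass-as template produces. The following is a minimal sketch using Python's standard argparse module. The function name parse_args is only illustrative, and the fallback default of 300 simply mirrors the YAML above.

```python
import argparse


def parse_args(argv=None):
    """Parse the arguments that {parameters} expands to for this step."""
    parser = argparse.ArgumentParser()
    # Matches `pass-as: --max_steps={v}` from the step definition.
    parser.add_argument("--max_steps", type=int, default=300)
    return parser.parse_args(argv)


# Simulate the command line `python train.py --max_steps=300`.
args = parse_args(["--max_steps=300"])
print(f"Training for {args.max_steps} steps")
```

In a real script you would call parse_args() without arguments so that argparse reads the actual command line.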
{parameters} placeholder
{parameters} injects all parameters at its position in the command.
For example:
- step:
    name: train-model
    image: python:3.6
    command:
      - python train.py {parameters}
    parameters:
      - name: max-steps
        type: integer
        description: Number of steps to run the trainer
        default: 300
      - name: learning-rate
        type: float
        pass-as: --lr={v}
        description: Initial learning rate
        default: 0.001
      - name: architecture
        type: string
        pass-as: arc {v}
        default: 10xRELU-SoftMax
        optional: true
The above would generate the following command by default:
python train.py --max-steps=300 --lr=0.001 arc 10xRELU-SoftMax
When no value is given, a parameter appears with its default value. Parameters of type flag are the exception: a flag appears in the command only when its value is set to true.
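The rendering rules from the example above can be imitated in a few lines of Python. This is only an illustrative sketch, not Valohai's implementation: render_parameter is a hypothetical helper, the verbose flag is not part of the YAML above, and rendering a true flag as a bare --name is an assumption.

```python
def render_parameter(name, value, type_="string", pass_as=None):
    """Return the command-line fragment for one parameter, or None to omit it."""
    if type_ == "flag":
        # Flags appear only when their value is true.
        if not value:
            return None
        # Assumption: a flag without pass-as renders as a bare --name.
        return pass_as if pass_as is not None else f"--{name}"
    if pass_as is not None:
        # {v} in the pass-as template is replaced with the value.
        return pass_as.replace("{v}", str(value))
    # Without pass-as, the fragment looks like --name=value.
    return f"--{name}={value}"


params = [
    ("max-steps", 300, "integer", None),
    ("learning-rate", 0.001, "float", "--lr={v}"),
    ("architecture", "10xRELU-SoftMax", "string", "arc {v}"),
    ("verbose", False, "flag", None),  # hypothetical flag, omitted because it is false
]
fragments = [render_parameter(*p) for p in params]
command = "python train.py " + " ".join(x for x in fragments if x is not None)
print(command)
```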
{parameter:} placeholder
You can also use single parameters with the {parameter:<NAME>} syntax.
For example:
- step:
    name: preprocess-and-train
    image: python:3.6
    command:
      - python preprocess.py {parameter:train-split}
      - python train.py {parameter:learning-rate}
    parameters:
      - name: train-split
        type: integer
        default: 80
      - name: learning-rate
        pass-as: --lr={v}
        type: float
        default: 0.001
The above would generate the following commands by default:
python preprocess.py --train-split=80
python train.py --lr=0.001
{parameter-value:} placeholder
If you wish to ignore the pass-as definition, you can use {parameter-value:<NAME>} to pass only the parameter value. This is essentially the same as defining pass-as: "{v}".
For example:
- step:
    name: preprocess
    image: python:3.6
    command:
      - python preprocess.py {parameter-value:train-split} {parameter-value:style}
    parameters:
      - name: train-split
        type: integer
        default: 80
      - name: style
        pass-as: -s={v}
        type: string
        default: nested
The above would generate the following command by default:
python preprocess.py 80 nested
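Because the values arrive without any option names, the receiving script has to read them as positional arguments. A minimal sketch of how preprocess.py might do that with argparse follows. The argument names train_split and style are chosen here for illustration, and only their order matters.

```python
import argparse


def parse_args(argv=None):
    """Parse bare values passed via {parameter-value:<NAME>} placeholders."""
    parser = argparse.ArgumentParser()
    # Positional arguments, in the order the placeholders appear in the command.
    parser.add_argument("train_split", type=int)
    parser.add_argument("style")
    return parser.parse_args(argv)


# Simulate the command line `python preprocess.py 80 nested`.
args = parse_args(["80", "nested"])
print(args.train_split, args.style)
```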
There is no limit on how many {parameters}, {parameter:<NAME>}, and {parameter-value:<NAME>} placeholders you can have in a set of commands, so use them to your heart’s content!
Placeholders in inputs
Parameter placeholders can also be used in inputs, for example:
- step:
    name: train-model
    image: tensorflow/tensorflow:0.12.1-devel-gpu
    command: python train.py
    parameters:
      - name: device_id
        type: integer
        default: 455
      - name: family
        type: string
        default: "katti"
    inputs:
      - name: images
        default: s3://factories/{parameter:device_id}/images/*
      - name: prod-model
        default: datum://prod-model-{parameter:family}
      - name: sensor-data
        default: dataset://sensor-data-{parameter:family}/{parameter:device_id}
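To see what the input defaults above might resolve to, the substitution can be imitated locally. This sketch assumes that a placeholder inside an input URI is replaced with the bare parameter value. That behaviour is inferred from the example rather than stated here, and resolve_input is a hypothetical helper, not part of any Valohai API.

```python
import re

# Matches {parameter:<NAME>} and captures the parameter name.
PLACEHOLDER = re.compile(r"\{parameter:([^}]+)\}")


def resolve_input(uri, values):
    """Replace each {parameter:<NAME>} in an input URI with its value."""
    # Assumption: inputs receive the bare value, without any pass-as formatting.
    return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), uri)


values = {"device_id": 455, "family": "katti"}  # defaults from the YAML above
for uri in (
    "s3://factories/{parameter:device_id}/images/*",
    "datum://prod-model-{parameter:family}",
    "dataset://sensor-data-{parameter:family}/{parameter:device_id}",
):
    print(resolve_input(uri, values))
```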
Multiple value placeholders
Parameters can have multiple values, and they can be used as command placeholders. In this case, the default should be a list.
Multiple inputs
Multiple value parameters are not supported as input placeholders.
- step:
    name: train-model
    image: tensorflow/tensorflow:0.12.1-devel-gpu
    command: python train.py {parameters}
    parameters:
      - name: seed_values
        type: integer
        multiple: separate
        default: [455, 922, 1344]
When used as a command placeholder, multiple value representation can be customized.
When using multiple: separate, items are joined with a comma into a single value and passed as a single command argument. The separator can be changed to another character with the multiple-separator property.
The default values [455, 922, 1344] would result in the following command:
python train.py --seed_values=455,922,1344
When using multiple: repeat instead, each item is passed as a repeating command argument. The same defaults result in the following command:
python train.py --seed_values=455 --seed_values=922 --seed_values=1344
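Both representations can be parsed with the standard argparse module. The sketch below shows one possible way to handle each style. The helper comma_separated_ints is illustrative, and it assumes the default comma separator; if you change multiple-separator, split on that character instead.

```python
import argparse


def comma_separated_ints(text):
    """Turn a string such as "455,922,1344" into a list of integers."""
    return [int(item) for item in text.split(",")]


# multiple: separate -> one argument that contains all items.
separate_parser = argparse.ArgumentParser()
separate_parser.add_argument("--seed_values", type=comma_separated_ints)

# multiple: repeat -> the same argument is given once per item.
repeat_parser = argparse.ArgumentParser()
repeat_parser.add_argument("--seed_values", type=int, action="append")

separate_args = separate_parser.parse_args(["--seed_values=455,922,1344"])
repeat_args = repeat_parser.parse_args(
    ["--seed_values=455", "--seed_values=922", "--seed_values=1344"]
)
print(separate_args.seed_values)
print(repeat_args.seed_values)
```

Either way the script ends up with the same list of integers, so the choice between separate and repeat mostly depends on what your argument parser handles most naturally.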
You can also read parameters from configuration files in JSON or YAML format, which can be easier for complicated parameters that are difficult to parse from the command line. Multiple value parameters are represented as lists in the configuration files, so no parsing or joining is required.
import json

# FYI: alternate YAML format can be found in /valohai/config/parameters.yaml
with open("/valohai/config/parameters.json") as f:
    params = json.load(f)

print(params["seed_values"])  # => [455, 922, 1344]
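When the same script is also run outside an execution, the configuration file will not exist. One possible way to handle that is to fall back to local defaults, as in the sketch below. The helper load_parameters and the DEFAULTS dictionary are illustrative names, and the fallback behaviour is a design choice of this example rather than something Valohai provides.

```python
import json
from pathlib import Path

# Local fallback values; these mirror the defaults in the YAML above.
DEFAULTS = {"seed_values": [455, 922, 1344]}


def load_parameters(path="/valohai/config/parameters.json", defaults=DEFAULTS):
    """Read parameters from the config file, falling back to local defaults."""
    config = Path(path)
    if not config.is_file():
        # The file is missing, e.g. when running on a local machine.
        return dict(defaults)
    with config.open() as f:
        # Values from the file override the local defaults.
        return {**defaults, **json.load(f)}


params = load_parameters()
print(params["seed_values"])
```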
valohai-utils can also read parameters from the file configuration, but note that command-line parameters are parsed first. If you wish to read parameters from the file using valohai-utils, do not also use {parameters} placeholders.