There are three ways to create new models and model versions: through the Model Hub web UI, through execution outputs, and by interacting with the Valohai API.
Web application
Creating a Model
On the Model Hub, click on “Create Model” and create a new model.
You only need to choose a name now, but note that this is the only time you can change the technical model:// URI (also known as a ‘slug’) used to refer to this model in execution inputs and other places within your Valohai projects.
Model name and URI maximum lengths
The model name and the URI identifier can each be up to 300 characters long. If you try to create a model with a name or URI that is too long, you will receive an error message.
Creating a Model Version
The Model Hub enables the creation of model versions as well. Open the overview page for a model, and click on “Create Version”. You can search for and add files from your project data library to the model version.
From execution outputs
Create the model using the Model Hub or API before adding versions to it with executions. Make note of its model://<model-name> URI.
Creating a Model Version
Use the valohai.model-versions metadata statement to create model versions from execution or task outputs. Create a metadata statement for each of the outputs to be added to the new model version. A single execution or task can add at most one new version per model.
import json

# Metadata keyed by output file name; add one entry per output file
metadata = {
    "model.pkl": {
        "factory": "eu-02",
        "valohai.tags": ["prod", "lemonade"],
        "valohai.model-versions": ["model://<model-name>/"]
    },
}

# Save the model file itself; `model` is your trained model object
save_path = "/valohai/outputs/model.pkl"
model.save(save_path)

# Write one JSON line per output file into the shared metadata file
metadata_path = "/valohai/outputs/valohai.metadata.jsonl"
with open(metadata_path, "w") as outfile:
    for file_name, file_metadata in metadata.items():
        json.dump({"file": file_name, "metadata": file_metadata}, outfile)
        outfile.write("\n")
You can add tags or release notes to model versions created this way with an extended valohai.model-versions data structure:
metadata = {
    "model.pkl": {
        "valohai.model-versions": [{
            "model_uri": "model://<model-name>/",
            "model_version_tags": ["green", "cucumber"],
            "model_release_note": "100% freshly processed"
        }]
    }
}
The last output to be processed that has non-empty values for these fields will contribute them to the resulting model version.
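When several outputs belong to the same model version, the simple and extended forms can be mixed in one metadata file. The following is a minimal sketch rather than an official helper: the function names and the second output file, labels.json, are hypothetical, and the output directory is a parameter so the code can be tried outside an execution.

```python
import json
from pathlib import Path

MODEL_URI = "model://<model-name>/"  # placeholder, replace with your model URI


def model_version_entry(model_uri, tags, release_note):
    """Build one extended valohai.model-versions entry (hypothetical helper)."""
    return {
        "model_uri": model_uri,
        "model_version_tags": list(tags),
        "model_release_note": release_note,
    }


def write_metadata_jsonl(metadata, output_dir="/valohai/outputs"):
    """Write one JSON line per output file into valohai.metadata.jsonl."""
    path = Path(output_dir) / "valohai.metadata.jsonl"
    with open(path, "w") as outfile:
        for file_name, file_metadata in metadata.items():
            json.dump({"file": file_name, "metadata": file_metadata}, outfile)
            outfile.write("\n")
    return path


metadata = {
    # Extended form: carries the tags and the release note
    "model.pkl": {
        "valohai.model-versions": [
            model_version_entry(
                MODEL_URI, ["green", "cucumber"], "100% freshly processed"
            )
        ]
    },
    # Simple form: only adds this (hypothetical) file to the same version
    "labels.json": {"valohai.model-versions": [MODEL_URI]},
}
```

Inside an execution, calling write_metadata_jsonl(metadata) after saving both files writes the metadata next to them in /valohai/outputs.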
Datasets
These fields can also be used in a valohai.dataset-versions metadata statement, in which case the dataset version that the datum contributes to is promoted to a model version. Note that if multiple datums contribute to the same dataset version but name different models, the resulting dataset version (containing every datum) is promoted to all of the specified models.
Legacy Approach of Creating a Model Version
In the legacy approach, each output file gets its own dedicated metadata file, named after the output with a .metadata.json suffix.
For example:
import json

metadata = {
    "valohai.model-versions": ["model://<model-name>/"]
}

save_path = '/valohai/outputs/model.pkl'
model.save(save_path)

metadata_path = '/valohai/outputs/model.pkl.metadata.json'
with open(metadata_path, 'w') as outfile:
    json.dump(metadata, outfile)
This approach still works, so if you have it in your project, you can continue using it. However, the JSONL-based method is recommended for consolidating metadata for multiple files into a single file.
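If you do want to move existing code to the JSONL format, the two layouts map onto each other directly. The sketch below is one possible way to consolidate legacy files and not a Valohai-provided tool: the function name is hypothetical, and deleting the legacy files afterwards is an assumption made here to avoid describing the same output twice.

```python
import json
from pathlib import Path

LEGACY_SUFFIX = ".metadata.json"


def consolidate_legacy_metadata(output_dir, remove_legacy=True):
    """Merge <file>.metadata.json files into a single valohai.metadata.jsonl."""
    output_dir = Path(output_dir)
    legacy_files = sorted(output_dir.glob("*" + LEGACY_SUFFIX))
    jsonl_path = output_dir / "valohai.metadata.jsonl"

    with open(jsonl_path, "w") as outfile:
        for legacy_file in legacy_files:
            # "model.pkl.metadata.json" describes the output "model.pkl"
            file_name = legacy_file.name[: -len(LEGACY_SUFFIX)]
            file_metadata = json.loads(legacy_file.read_text())
            json.dump({"file": file_name, "metadata": file_metadata}, outfile)
            outfile.write("\n")

    if remove_legacy:
        # Assumption: keep only one metadata source per output file
        for legacy_file in legacy_files:
            legacy_file.unlink()

    return jsonl_path
```

Calling consolidate_legacy_metadata("/valohai/outputs") at the end of an execution, after all outputs and their legacy metadata files have been written, would leave a single metadata file in the outputs directory.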
For more details on managing metadata for multiple files, see Data > Files > Save additional context.
Using the API
Creating a Model
You can create a new model by sending a POST request to https://app.valohai.com/api/v0/models/ with the following JSON body:
{
    "name": "<model-name>",
    "slug": "<model-uri-name>",
    "owner": <owner organization numeric ID>
}
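As an illustration, the request could be built in Python with only the standard library. This is a sketch under assumptions: the function name is hypothetical, and the Authorization: Token header assumes you authenticate with a personal Valohai API token. The function only builds the request, so nothing is sent until you pass it to urlopen.

```python
import json
import urllib.request

MODELS_URL = "https://app.valohai.com/api/v0/models/"


def create_model_request(name, slug, owner_id, token):
    """Build (but do not send) the POST request that creates a model."""
    payload = {"name": name, "slug": slug, "owner": owner_id}
    return urllib.request.Request(
        url=MODELS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # Assumption: token-based authentication
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, open the request with urllib.request.urlopen(request) and parse the JSON response body, which should describe the created model.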
Creating a Model Version
You can create a new model version from a list of datums by sending a POST request to https://app.valohai.com/api/v0/model-versions/ with the following JSON body:
{
    "model": "<model UUID>",
    "datums": ["<datum 1 UUID>", "<datum 2 UUID>", ...]
}
You can alternatively promote a dataset version into a model version with the following POST request structure:
{
    "model": "<model UUID>",
    "dataset_version": "<dataset version UUID>"
}
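Both request bodies can be produced by one small helper. The following is a sketch and not part of any Valohai client library: the function names are hypothetical, the Authorization: Token header assumes token-based authentication, and treating the two forms as mutually exclusive is an assumption based on them being presented as alternatives.

```python
import json
import urllib.request

MODEL_VERSIONS_URL = "https://app.valohai.com/api/v0/model-versions/"


def model_version_payload(model_id, datum_ids=None, dataset_version_id=None):
    """Build the request body from either a datum list or a dataset version."""
    # Assumption: exactly one of the two sources must be given
    if bool(datum_ids) == bool(dataset_version_id):
        raise ValueError("Pass either datum_ids or dataset_version_id")
    payload = {"model": model_id}
    if datum_ids:
        payload["datums"] = list(datum_ids)
    else:
        payload["dataset_version"] = dataset_version_id
    return payload


def create_model_version_request(payload, token):
    """Build (but do not send) the POST request that creates a model version."""
    return urllib.request.Request(
        url=MODEL_VERSIONS_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Pass the resulting request to urllib.request.urlopen to send it. The UUIDs come from your own project; the placeholders in the bodies above are not valid values.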