Create and Manage Models
Create models in Model Hub, generate versions from training executions, manage approval states, and use models in production pipelines.
Overview
Model Hub workflow:
Create model: Define model in registry (one-time setup)
Train model: Run training execution
Create version: Automatically add model version from outputs
Review & approve: Validate metrics, approve for production
Deploy: Use a model:// URI in production workflows
Create a Model
Models are containers for versions. Create a model once, then add multiple versions over time.
Via Web UI
Navigate to Models (in project or organization view)
Click "Create Model"
Enter Model name (e.g., "flower")
Optionally associate with a project
Click "Create"

⚠️ Important: The model URI (flower) becomes model://flower/ and cannot be changed later. Choose carefully.
Project-Associated Models
Organization view: All models visible
Project view: Only associated models visible (cleaner organization)
To associate:
Creating from project view → Automatically associated
Creating from org view → Choose project during creation
Change later → Organization Settings → Models
Benefit: Organize models by project while still using them across projects.
Create Model Versions from Training
The recommended approach: automatically create model versions when training executions complete.
Step 1: Train and Save Model
train.py:
import pickle
import json
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
import pandas as pd
# Load data
data = pd.read_csv('/valohai/inputs/training-data/data.csv')
X = data.drop('target', axis=1)
y = data['target']
# Train model
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = RandomForestClassifier(n_estimators=100, max_depth=10)
model.fit(X_train, y_train)
# Evaluate
accuracy = model.score(X_test, y_test)
print(f"Accuracy: {accuracy:.4f}")
# Save model file to outputs
model_path = '/valohai/outputs/model.pkl'
with open(model_path, 'wb') as f:
    pickle.dump(model, f)
print(f"Saved model to {model_path}")
Step 2: Create Model Version with Metadata
In train.py, add valohai.model-versions metadata so that the model version is created automatically:
# After saving model file, create metadata
metadata = {
    "model.pkl": {
        "valohai.model-versions": ["model://flower/"],
        "valohai.tags": ["computer-vision", "production-candidate"],
        "accuracy": accuracy,
        "training_samples": len(X_train),
        "test_samples": len(X_test)
    }
}
# Save metadata file
metadata_path = '/valohai/outputs/valohai.metadata.jsonl'
with open(metadata_path, 'w') as f:
    for filename, file_metadata in metadata.items():
        json.dump({"file": filename, "metadata": file_metadata}, f)
        f.write('\n')
print("Created model version in model://flower/")
What happens:
Execution saves model.pkl to /valohai/outputs/
Metadata file tells Valohai to add this file to model://flower/
New model version is created in Pending state
Version includes model.pkl and all metadata
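The metadata-writing step above can be wrapped in a small helper and dry-run locally before it goes into a training script. This is a minimal sketch: `write_metadata_jsonl` is a hypothetical helper name, a temporary directory stands in for /valohai/outputs/, and the accuracy value is a placeholder rather than a real result.

```python
import json
import os
import tempfile


def write_metadata_jsonl(outputs_dir, metadata):
    """Write one JSON object per output file to valohai.metadata.jsonl."""
    path = os.path.join(outputs_dir, "valohai.metadata.jsonl")
    with open(path, "w") as f:
        for filename, file_metadata in metadata.items():
            f.write(json.dumps({"file": filename, "metadata": file_metadata}) + "\n")
    return path


# Local dry run: a temporary directory stands in for /valohai/outputs/.
demo_dir = tempfile.mkdtemp()
demo_path = write_metadata_jsonl(
    demo_dir,
    {
        "model.pkl": {
            "valohai.model-versions": ["model://flower/"],
            "accuracy": 0.91,  # placeholder value for the dry run
        }
    },
)

# Read the file back to confirm every line parses as a standalone JSON object.
with open(demo_path) as f:
    records = [json.loads(line) for line in f]

print(records[0]["file"])
```

Reading the file back catches serialization problems (for example, a value that `json` cannot encode) before the execution finishes.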
Step 3: Add Release Notes and Tags
Include additional metadata for the version:
metadata = {
    "model.pkl": {
        "valohai.model-versions": [{
            "model_uri": "model://flower/",
            "model_version_tags": ["improved-recall", "production-ready"],
            "model_release_note": "Improved recall by 8% using balanced class weights"
        }],
        "valohai.tags": ["computer-vision", "v2-architecture"],
        "accuracy": 0.94,
        "precision": 0.92,
        "recall": 0.89,
        "f1_score": 0.905
    }
}
metadata_path = '/valohai/outputs/valohai.metadata.jsonl'
with open(metadata_path, 'w') as f:
    for filename, file_metadata in metadata.items():
        json.dump({"file": filename, "metadata": file_metadata}, f)
        f.write('\n')
Metadata fields:
model_uri: Which model to add the version to
model_version_tags: Tags specific to this version
model_release_note: Description of changes/improvements
Custom properties: Any metrics or context (accuracy, etc.)
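To keep the fields above consistent across training scripts, the version entry can be built by one function. The sketch below is illustrative only: `model_version_entry` is a hypothetical helper, the URI check simply mirrors the model://<name>/ shape used in the examples on this page, and leaving out optional keys when they are not supplied is an assumption.

```python
def model_version_entry(model_uri, tags=None, release_note=None):
    """Build one item for the "valohai.model-versions" list."""
    # Assumption: URIs follow the model://<model-name>/ shape shown on this page.
    if not (model_uri.startswith("model://") and model_uri.endswith("/")):
        raise ValueError(f"expected a URI like model://<model-name>/, got {model_uri!r}")
    entry = {"model_uri": model_uri}
    if tags:
        entry["model_version_tags"] = list(tags)
    if release_note:
        entry["model_release_note"] = release_note
    return entry


entry = model_version_entry(
    "model://flower/",
    tags=["improved-recall"],
    release_note="Improved recall by 8% using balanced class weights",
)
file_metadata = {"valohai.model-versions": [entry], "recall": 0.89}
print(file_metadata)
```

A typo such as `model:/flower` then fails in the training script rather than producing a metadata file that points nowhere.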
Complete Training Example
train.py:
import pickle
import json
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
import pandas as pd
# Load training data
print("Loading training data...")
data = pd.read_csv('/valohai/inputs/training-data/data.csv')
X = data.drop('churn', axis=1)
y = data['churn']
# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(f"Training set: {len(X_train)} samples")
print(f"Test set: {len(X_test)} samples")
# Train model
print("Training model...")
model = GradientBoostingClassifier(
    n_estimators=100,
    learning_rate=0.1,
    max_depth=5,
    random_state=42
)
model.fit(X_train, y_train)
# Evaluate
print("Evaluating model...")
y_pred = model.predict(X_test)
metrics = {
    'accuracy': accuracy_score(y_test, y_pred),
    'precision': precision_score(y_test, y_pred),
    'recall': recall_score(y_test, y_pred),
    'f1_score': f1_score(y_test, y_pred),
    'training_samples': len(X_train),
    'test_samples': len(X_test)
}
print(f"Accuracy: {metrics['accuracy']:.4f}")
print(f"Precision: {metrics['precision']:.4f}")
print(f"Recall: {metrics['recall']:.4f}")
print(f"F1 Score: {metrics['f1_score']:.4f}")
# Save model
print("Saving model...")
model_path = '/valohai/outputs/model.pkl'
with open(model_path, 'wb') as f:
    pickle.dump(model, f)
# Save feature importance
feature_importance = pd.DataFrame({
    'feature': X.columns,
    'importance': model.feature_importances_
}).sort_values('importance', ascending=False)
feature_importance.to_csv('/valohai/outputs/feature_importance.csv', index=False)
# Create model version with complete metadata
metadata = {
    "model.pkl": {
        "valohai.model-versions": [{
            "model_uri": "model://flower/",
            "model_version_tags": ["improved-recall", "production-ready"],
            "model_release_note": "Improved recall by 8% using balanced class weights"
        }],
        "valohai.tags": ["computer-vision", "v2-architecture"],
        # Record the metrics computed above instead of hard-coded values
        **metrics
    }
}
# Save metadata
metadata_path = '/valohai/outputs/valohai.metadata.jsonl'
with open(metadata_path, 'w') as f:
    for filename, file_metadata in metadata.items():
        json.dump({"file": filename, "metadata": file_metadata}, f)
        f.write('\n')
print("Model version created successfully!")
print("Status: Pending (awaiting review)")
valohai.yaml:
- step:
    name: train-flower-model
    image: python:3.9
    command:
      - pip install scikit-learn pandas
      - python train.py
    inputs:
      - name: training-data
        default: dataset://customer-data/train-2024-q1
Result:
✅ Model version created in Model Hub
✅ State: Pending (awaiting approval)
✅ Files: model.pkl, feature_importance.csv
✅ Metadata: Metrics, tags, release notes
✅ Lineage: Linked to training execution and data
Model Version States
Every model version has a state in the approval workflow.
Pending (Initial State)
What it means: Newly created, awaiting review
When to use: All new model versions start here
Actions available:
Review metrics and lineage
Compare to previous versions
Approve or reject
Approved
What it means: Validated for production use
When to use: Model meets quality criteria and is ready for deployment
Actions available:
Use in production pipelines
Compare to other approved versions
Revert to pending if issues found
How to approve:
Open model version in UI
Review metrics and artifacts
Click "Approve" button
Add approval notes (optional)

Rejected
What it means: Not suitable for production
When to use: Model fails quality checks, shows bias, or has issues
Actions available:
Document rejection reason
Use rejection notes to inform next iteration
Cannot use in production (intentionally blocked)
How to reject:
Open model version in UI
Click "Reject" button
Required: Add rejection reason
Common reasons: "Overfitting on test set", "Bias detected in predictions", "Worse than baseline"
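The approval workflow described above can be summarized as a small state machine. This is a local illustration, not the Valohai API: `VersionState`, `change_state`, and `blocked_from_production` are hypothetical names, and only the transitions named on this page are encoded (transitions out of Rejected are not described here, so the sketch leaves them out).

```python
from enum import Enum


class VersionState(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


# Transitions named on this page; anything else is rejected by the sketch.
ALLOWED = {
    (VersionState.PENDING, VersionState.APPROVED),
    (VersionState.PENDING, VersionState.REJECTED),
    (VersionState.APPROVED, VersionState.PENDING),  # revert if issues are found
}


def change_state(current, target, note=""):
    """Return the new state, enforcing the rules described in this section."""
    if (current, target) not in ALLOWED:
        raise ValueError(f"transition {current.value} -> {target.value} is not described")
    # Approval notes are optional, but a rejection reason is required.
    if target is VersionState.REJECTED and not note.strip():
        raise ValueError("a rejection reason is required")
    return target


def blocked_from_production(state):
    """Rejected versions are intentionally blocked from production use."""
    return state is VersionState.REJECTED


state = change_state(VersionState.PENDING, VersionState.REJECTED, note="Worse than baseline")
print(state.value, blocked_from_production(state))
```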
Model Version Numbers
Versions are automatically numbered sequentially:
model://customer-churn/v1 # First version
model://customer-churn/v2 # Second version
model://customer-churn/v3 # Third version
Special aliases:
model://customer-churn/latest # Latest approved version
💡 Tip: Use the latest alias for production deployments that should automatically use the newest approved version.
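When scripts need to log or branch on which version they were given, a model URI can be split into its parts. The sketch below assumes only the URI shapes shown on this page (model://name/, model://name/vN, model://name/latest); `parse_model_uri` and `version_number` are hypothetical helpers, not part of any Valohai library.

```python
import re

# Matches model://<name>, model://<name>/ and model://<name>/<version>.
MODEL_URI = re.compile(r"^model://(?P<name>[^/]+)(?:/(?P<version>[^/]*))?$")


def parse_model_uri(uri):
    """Split a model URI into (name, version); version is None when absent."""
    match = MODEL_URI.match(uri)
    if match is None:
        raise ValueError(f"not a model URI: {uri!r}")
    return match.group("name"), match.group("version") or None


def version_number(version):
    """Return 3 for 'v3'; None for aliases such as 'latest' or a missing version."""
    if version and re.fullmatch(r"v\d+", version):
        return int(version[1:])
    return None


name, version = parse_model_uri("model://customer-churn/v3")
print(name, version, version_number(version))
```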
Use Models in Workflows
As Execution Input
valohai.yaml:
- step:
    name: batch-inference
    image: python:3.9
    command:
      - pip install scikit-learn pandas
      - python predict.py
    inputs:
      - name: model
        default: model://customer-churn/v1
      - name: inference-data
        default: dataset://customer-data/inference-batch
predict.py:
import pickle
import pandas as pd
# Load model from model:// input
model_path = '/valohai/inputs/model/model.pkl'
with open(model_path, 'rb') as f:
    model = pickle.load(f)
# Load inference data
data = pd.read_csv('/valohai/inputs/inference-data/data.csv')
# Make predictions
predictions = model.predict(data)
probabilities = model.predict_proba(data)
# Save predictions
results = pd.DataFrame({
    'customer_id': data['customer_id'],
    'churn_prediction': predictions,
    'churn_probability': probabilities[:, 1]
})
results.to_csv('/valohai/outputs/predictions.csv', index=False)
print(f"Generated predictions for {len(results)} customers")
Using Latest Approved Version
inputs:
  - name: model
    default: model://customer-churn/latest # Always uses latest approved
Benefit: Create a new model version, approve it, and every production pipeline that references latest uses the new version on its next run.
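The behavior of the latest alias can be pictured with a short local sketch. It reads "latest approved" as the highest-numbered version in the Approved state; whether Valohai orders by version number or by approval time is not stated on this page, so treat that as an assumption. `resolve_latest` is a hypothetical function.

```python
def resolve_latest(versions):
    """Pick the highest-numbered approved version from (number, state) pairs."""
    approved = [number for number, state in versions if state == "approved"]
    if not approved:
        raise LookupError("no approved version to resolve 'latest' to")
    return f"v{max(approved)}"


# v5 exists but is still pending, so the alias keeps pointing at v4.
history = [(3, "approved"), (4, "approved"), (5, "pending")]
before_approval = resolve_latest(history)

# Once v5 is approved, the same alias resolves to v5 on the next run.
history[2] = (5, "approved")
after_approval = resolve_latest(history)

print(before_approval, after_approval)
```

Under this reading, pending and rejected versions never affect pipelines that use the alias.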
Create Model Version via UI
For existing model files not from training executions:
Navigate to your model
Click "Create Version"
Search for files in data library
Select model file(s)
Add version tags and release notes
Click "Create"
Use cases:
Import externally trained models
Promote experiment checkpoints to model registry
Add models trained outside Valohai
Manage Multiple Files per Version
A model version can contain multiple files:
metadata = {
    "model.pkl": {
        "valohai.model-versions": ["model://recommendation/"]
    },
    "preprocessor.pkl": {
        "valohai.model-versions": ["model://recommendation/"]
    },
    "feature_config.json": {
        "valohai.model-versions": ["model://recommendation/"]
    }
}
Result: All three files included in the version:
model://recommendation/v1
├── model.pkl
├── preprocessor.pkl
└── feature_config.json
Access in inference:
model = pickle.load(open('/valohai/inputs/model/model.pkl', 'rb'))
preprocessor = pickle.load(open('/valohai/inputs/model/preprocessor.pkl', 'rb'))
config = json.load(open('/valohai/inputs/model/feature_config.json'))
Legacy Approach: Sidecar Metadata Files
The older approach used individual .metadata.json files:
import json
import pickle
# Save model
with open('/valohai/outputs/model.pkl', 'wb') as f:
    pickle.dump(model, f)
# Create sidecar metadata file
metadata = {
    "valohai.model-versions": ["model://customer-churn/"]
}
with open('/valohai/outputs/model.pkl.metadata.json', 'w') as f:
    json.dump(metadata, f)
This still works, but JSONL format is recommended for:
Consolidating metadata for multiple files
Cleaner outputs directory
Consistency with dataset versioning
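For scripts that still write sidecars, the consolidation benefit can be sketched as a small migration step that folds every <file>.metadata.json in a directory into one valohai.metadata.jsonl. `consolidate_sidecars` is a hypothetical helper, and the demo runs against a temporary directory rather than /valohai/outputs/. The sketch leaves the sidecar files in place; removing them afterwards is left to the caller.

```python
import json
import os
import tempfile

SUFFIX = ".metadata.json"


def consolidate_sidecars(outputs_dir):
    """Collect <file>.metadata.json sidecars into one valohai.metadata.jsonl."""
    records = []
    for name in sorted(os.listdir(outputs_dir)):
        if name.endswith(SUFFIX):
            with open(os.path.join(outputs_dir, name)) as f:
                # Strip the suffix to recover the name of the described file.
                records.append({"file": name[: -len(SUFFIX)], "metadata": json.load(f)})
    with open(os.path.join(outputs_dir, "valohai.metadata.jsonl"), "w") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
    return records


# Demo: two legacy sidecars in a temporary directory.
demo_dir = tempfile.mkdtemp()
for filename in ("model.pkl", "preprocessor.pkl"):
    with open(os.path.join(demo_dir, filename + SUFFIX), "w") as f:
        json.dump({"valohai.model-versions": ["model://customer-churn/"]}, f)

records = consolidate_sidecars(demo_dir)
print([record["file"] for record in records])
```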
Find Model URI
In Model Hub UI:
Navigate to your model
Select a version
Copy model URI from version details panel

Format:
model://<model-name>/<version>
Common Workflow: Train → Approve → Deploy
Step 1: Train Model
vh execution run --step train-churn-model
Result: New version created in Pending state
Step 2: Review & Approve
Open Model Hub → Find model
View new version (Pending)
Review:
Training metrics
Lineage (which data was used)
Compare to previous versions
Click "Approve"
Add approval notes: "Approved for staging deployment - 3% improvement in recall"
Result: Version state changed to Approved
Step 3: Deploy
Option A: Update deployment to use new version:
- step:
    name: production-inference
    inputs:
      - name: model
        default: model://customer-churn/v5 # Updated from v4
Option B: Use latest alias (automatic):
- step:
    name: production-inference
    inputs:
      - name: model
        default: model://customer-churn/latest # Auto-updates to v5
Result: The production system now uses approved model v5. The model can be used for inference inside Valohai executions or deployed outside Valohai to your serving infrastructure.
Best Practices
Descriptive Model Names
✅ Good:
model://customer-churn
model://fraud-detection-transactions
model://recommendation-engine-products
❌ Avoid:
model://model1
model://my-model
model://test
Use Release Notes
# ✅ Good: Informative release notes
{"model_release_note": "Improved recall by 8% using class weights. Addressed bias in age feature."}
# ❌ Avoid: Vague notes
{"model_release_note": "New model"}
Tag Strategically
# ✅ Good: Meaningful tags
"model_version_tags": ["gradient-boosting", "production-ready", "q1-2024"]
# ❌ Avoid: Generic tags
"model_version_tags": ["model", "new", "good"]
Include Key Metrics
# ✅ Good: Complete metrics
metadata = {
    "model.pkl": {
        "valohai.model-versions": ["model://churn/"],
        "accuracy": 0.94,
        "precision": 0.92,
        "recall": 0.89,
        "f1_score": 0.905,
        "auc_roc": 0.96,
        "training_samples": 50000,
        "test_samples": 12500
    }
}
Review Before Approving
Checklist:
✅ Metrics better than baseline
✅ No overfitting (train vs test performance similar)
✅ Lineage verified (correct training data)
✅ Fairness/bias checked
✅ Comparison to previous version documented
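The two numeric items on the checklist (beating the baseline, and a small train/test gap) can be pre-checked in code before a human review. This is a sketch under stated assumptions: `review_findings` is a hypothetical function, `train_accuracy` is a hypothetical metric the training script would need to record, the 0.05 gap threshold is arbitrary, and the metric values are placeholders. Lineage and fairness checks still need a person.

```python
def review_findings(candidate, baseline, max_gap=0.05):
    """Return a list of problems; an empty list means the numeric checks passed."""
    findings = []
    # Every metric present in the baseline must be strictly better.
    for name, base_value in baseline.items():
        value = candidate.get(name)
        if value is None or value <= base_value:
            findings.append(f"{name}: {value} does not beat baseline {base_value}")
    # A large train/test gap is treated as a sign of overfitting.
    gap = candidate["train_accuracy"] - candidate["accuracy"]
    if gap > max_gap:
        findings.append(f"train/test accuracy gap {gap:.3f} exceeds {max_gap}")
    return findings


candidate = {"accuracy": 0.94, "recall": 0.89, "train_accuracy": 0.96}
baseline = {"accuracy": 0.92, "recall": 0.90}
findings = review_findings(candidate, baseline)
for finding in findings:
    print(finding)
```

The returned strings can double as a draft rejection reason when a version fails the checks.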
Related Pages
Models Overview — Why use Model Hub
Model Artifacts & Versioning — Advanced versioning patterns
Add Context to Your Data Files — Metadata system details
Next Steps
Create your first model in Model Hub
Train a model and create a version automatically
Set up approval workflow with your team
Deploy using model:// URIs
Explore automated deployment workflows
