Custom Properties
Custom properties let you attach any structured data to your files in JSON format. Use properties to track experiment results, data quality metrics, processing conditions, or any contextual information your team needs.
When to Use Properties
Properties store rich data beyond simple labels:
Experiment Tracking
Track hyperparameters, metrics, and training details:
{
    "model_architecture": "resnet50",
    "optimizer": "adam",
    "learning_rate": 0.001,
    "batch_size": 32,
    "epochs": 100,
    "final_loss": 0.023,
    "accuracy": 0.95,
    "precision": 0.93,
    "recall": 0.97,
    "training_time_minutes": 145
}
Data Quality
Record validation results and processing metrics:
{
    "input_rows": 10000,
    "output_rows": 9850,
    "rows_filtered": 150,
    "null_percentage": 0.02,
    "duplicate_percentage": 0.01,
    "quality_score": 0.985,
    "validation_passed": true,
    "processing_duration_seconds": 45
}
Production Context
Capture environmental and operational data:
{
    "factory_id": "eu-02",
    "production_line": "A",
    "batch_number": "2024-Q1-001",
    "operator_id": "OP-123",
    "temperature_celsius": 22.5,
    "humidity_percentage": 45,
    "timestamp": "2024-01-15T10:30:00Z",
    "quality_check_passed": true
}
How to Add Properties
Properties are added through metadata files using any custom JSON keys, except the reserved keys valohai.tags, valohai.alias, valohai.model-versions, and valohai.dataset-versions.
During Execution
import json

# Your custom properties (any JSON structure)
metadata = {
    "accuracy": 0.95,
    "precision": 0.93,
    "recall": 0.97,
    "epochs": 100,
    "learning_rate": 0.001,
    "hyperparameters": {
        "dropout": 0.5,
        "optimizer": "adam",
        "batch_size": 32
    },
    "training_history": [
        {"epoch": 1, "loss": 0.5},
        {"epoch": 2, "loss": 0.3}
    ]
}

# Save your output file
save_path = '/valohai/outputs/model.pkl'
model.save(save_path)

# Save metadata with properties
metadata_path = '/valohai/outputs/model.pkl.metadata.json'
with open(metadata_path, 'w') as f:
    json.dump(metadata, f)
Combine with Tags and Aliases
import json

metadata = {
    # Reserved Valohai keys
    "valohai.tags": ["validated", "production"],
    "valohai.alias": "model-prod",
    "valohai.dataset-versions": ["dataset://big-data/latest"],
    # Your custom properties
    "accuracy": 0.95,
    "dataset_version": "v2.3",
    "training_duration_minutes": 145,
    "gpu_type": "V100",
    "experiment_notes": "Increased batch size for better convergence"
}

save_path = '/valohai/outputs/model.pkl'
model.save(save_path)

metadata_path = '/valohai/outputs/model.pkl.metadata.json'
with open(metadata_path, 'w') as f:
    json.dump(metadata, f)
For Multiple Files
Use valohai.metadata.jsonl for many files:
import json

# Save all output files
for i in range(100):
    image.save(f'/valohai/outputs/image_{i:03d}.jpg')

# Add properties to all files
metadata_path = '/valohai/outputs/valohai.metadata.jsonl'
with open(metadata_path, 'w') as f:
    for i in range(100):
        entry = {
            "file": f"image_{i:03d}.jpg",
            "metadata": {
                "quality_score": quality_scores[i],
                "processing_time_ms": processing_times[i],
                "resolution": "1920x1080",
                "format": "JPEG",
                "compression": 85
            }
        }
        json.dump(entry, f)
        f.write('\n')
💡 For complete details on metadata methods, see Add Context to Your Data Files.
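One practical caveat: every property value must be JSON-serializable. Values such as numpy scalars or datetime objects make json.dump raise a TypeError, so convert them to plain Python types before writing. A minimal sketch of one way to do this; the to_jsonable and write_properties helpers are illustrative, not part of any Valohai library:
import json
from datetime import datetime, timezone

def to_jsonable(value):
    # Called by json.dump for any value it cannot serialize directly.
    if isinstance(value, datetime):
        return value.isoformat()
    if hasattr(value, "item"):  # numpy scalars expose .item()
        return value.item()
    raise TypeError(f"{type(value).__name__} is not JSON-serializable")

def write_properties(path, metadata):
    # Write a sidecar metadata file, coercing common non-JSON types.
    with open(path, 'w') as f:
        json.dump(metadata, f, default=to_jsonable)

write_properties('/valohai/outputs/model.pkl.metadata.json', {
    "accuracy": 0.95,
    "trained_at": datetime.now(timezone.utc),
})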
Read Properties in Code
Access metadata from input files during execution to make data-driven decisions.
Access Input Metadata
import json

# Load input configuration
with open('/valohai/config/inputs.json') as f:
    vh_inputs_config = json.load(f)

# Access properties from input named "model"
for file_data in vh_inputs_config['model']['files']:
    metadata = file_data['metadata']

    # Use properties for conditional logic
    if metadata.get('validation_score', 0) > 0.9:
        print(f"High quality model: {file_data['name']}")
        model_path = file_data['path']

    # Log metadata for tracking
    print(f"Model trained with {metadata.get('epochs')} epochs")
    print(f"Accuracy: {metadata.get('accuracy')}")
Use Cases for Reading Properties
Filter inputs by quality:
# Only process high-quality data
high_quality_files = [
    f for f in vh_inputs_config['dataset']['files']
    if f['metadata'].get('quality_score', 0) > 0.95
]
Conditional processing:
# Use different logic based on data source
if metadata.get('factory_id') == 'eu-02':
    apply_eu_preprocessing()
else:
    apply_us_preprocessing()
Audit trails:
import logging

# Log processing context
logging.info(f"Processing batch {metadata.get('batch_number')}")
logging.info(f"Source: {metadata.get('production_line')}")
logging.info(f"Quality: {metadata.get('quality_check_passed')}")
Add Properties via API
Add or update properties after execution completes using the Valohai API.
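The snippets below assume you already know the datum IDs of the files you want to update; you can copy them from the Data tab in the web application. If you prefer to look them up programmatically, the datum list endpoint is one option. A minimal sketch; the project query parameter and the response field names shown here are assumptions, so verify them against the Valohai API reference:
import os
import requests

response = requests.get(
    'https://app.valohai.com/api/v0/data/',
    params={'project': '<your-project-id>'},  # assumed filter parameter; check the API reference
    headers={'Authorization': 'Token ' + os.getenv('VH_TOKEN')},
)
response.raise_for_status()

for datum in response.json().get('results', []):
    # 'id' and 'name' are assumed field names in the list response
    print(datum.get('id'), datum.get('name'))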
Single Datum
Apply properties to one file:
import os
import requests

properties = {
    "validation_score": 0.98,
    "approved_by": "data-team",
    "approval_date": "2024-01-15",
    "notes": "Passed all quality checks"
}

datum_id = "01234567-89ab-cdef-0123-456789abcdef"
response = requests.post(
    f'https://app.valohai.com/api/v0/data/{datum_id}/metadata/',
    json=properties,
    headers={
        'Authorization': 'Token ' + os.getenv('VH_TOKEN'),
        'Content-Type': 'application/json'
    }
)
print(f"Status: {response.status_code}")
Multiple Datums (Different Properties)
Apply different properties to each file:
import os
import requests
payload = {
    "datum_metadata": {
        "datum-id-1": {
            "quality_score": 0.95,
            "validation_status": "passed",
            "reviewer": "alice"
        },
        "datum-id-2": {
            "quality_score": 0.87,
            "validation_status": "passed",
            "reviewer": "bob"
        },
        "datum-id-3": {
            "quality_score": 0.65,
            "validation_status": "failed",
            "reviewer": "alice",
            "failure_reason": "low accuracy"
        }
    }
}

response = requests.post(
    'https://app.valohai.com/api/v0/data/metadata/apply/',
    json=payload,
    headers={
        'Authorization': 'Token ' + os.getenv('VH_TOKEN'),
        'Content-Type': 'application/json'
    }
)
print(f"Status: {response.status_code}")
Multiple Datums (Same Properties)
Apply the same properties to all files:
import os
import requests
payload = {
    "metadata": {
        "processing_version": "v2.1",
        "validated": True,
        "validation_date": "2024-01-15",
        "validator": "automated-pipeline"
    },
    "datum_ids": [
        "datum-id-1",
        "datum-id-2",
        "datum-id-3"
    ]
}

response = requests.post(
    'https://app.valohai.com/api/v0/data/metadata/apply-all/',
    json=payload,
    headers={
        'Authorization': 'Token ' + os.getenv('VH_TOKEN'),
        'Content-Type': 'application/json'
    }
)
print(f"Status: {response.status_code}")
💡 API Token: Get your API token from your Valohai account settings. See Make calls to the Valohai API for details.
Update or Remove Properties
Set property values to None to remove them:
import os
import requests

properties = {
    "old_metric": None,                  # Remove this property
    "deprecated_field": None,            # Remove this property
    "new_metric": 0.92,                  # Add or update
    "validation_status": "re-approved"   # Update
}

datum_id = "01234567-89ab-cdef-0123-456789abcdef"
response = requests.post(
    f'https://app.valohai.com/api/v0/data/{datum_id}/metadata/',
    json=properties,
    headers={
        'Authorization': 'Token ' + os.getenv('VH_TOKEN'),
        'Content-Type': 'application/json'
    }
)
View Properties
Web Application
1. Navigate to your project's Data tab
2. Click on any file to open its details
3. Scroll to the Properties section
4. Search or filter properties by key
5. Hover over values to see the full content
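API
To check the same properties programmatically, you can fetch the datum record through the API. A minimal sketch; the GET endpoint and the metadata field in the response are assumptions based on the endpoints used above, so verify them against the Valohai API reference:
import os
import requests

datum_id = "01234567-89ab-cdef-0123-456789abcdef"
response = requests.get(
    f'https://app.valohai.com/api/v0/data/{datum_id}/',
    headers={'Authorization': 'Token ' + os.getenv('VH_TOKEN')},
)
response.raise_for_status()

# Assumes the datum record exposes its custom properties under a "metadata" key.
print(response.json().get('metadata'))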

Common Property Patterns
Experiment Metadata
metadata = {
    # Model architecture
    "model_type": "resnet50",
    "framework": "pytorch",
    "framework_version": "2.0.1",

    # Hyperparameters
    "optimizer": "adam",
    "learning_rate": 0.001,
    "batch_size": 32,
    "epochs": 100,
    "dropout": 0.5,

    # Results
    "final_loss": 0.023,
    "accuracy": 0.95,
    "precision": 0.93,
    "recall": 0.97,
    "f1_score": 0.95,

    # Resources
    "training_time_minutes": 145,
    "gpu_type": "V100",
    "num_gpus": 4,

    # Context
    "experiment_id": "exp-042",
    "researcher": "alice",
    "notes": "Best performing model so far"
}
Data Quality Metadata
metadata = {
    # Volume
    "input_rows": 10000,
    "output_rows": 9850,
    "rows_filtered": 150,
    "rows_deduplicated": 100,

    # Quality metrics
    "null_percentage": 0.02,
    "duplicate_percentage": 0.01,
    "outlier_percentage": 0.005,
    "quality_score": 0.985,

    # Validation
    "validation_passed": True,
    "validation_checks": [
        "schema_valid",
        "no_nulls_in_key_columns",
        "date_range_valid"
    ],

    # Processing
    "processing_duration_seconds": 45,
    "data_version": "v2.3",
    "processing_date": "2024-01-15"
}
Production Metadata
metadata = {
    # Source
    "factory_id": "eu-02",
    "production_line": "A",
    "batch_number": "2024-Q1-001",
    "operator_id": "OP-123",

    # Conditions
    "temperature_celsius": 22.5,
    "humidity_percentage": 45,
    "pressure_mbar": 1013,

    # Quality
    "quality_check_passed": True,
    "defect_rate": 0.002,
    "inspector": "QC-456",

    # Timestamps
    "production_start": "2024-01-15T08:00:00Z",
    "production_end": "2024-01-15T16:00:00Z",
    "inspection_time": "2024-01-15T16:30:00Z"
}
Best Practices
Use Consistent Keys
# Good: Consistent naming
"learning_rate"
"batch_size"
"accuracy"

# Avoid: Inconsistent naming
"learningRate"
"batch-size"
"Accuracy"
Structure Nested Data
# Good: Organized structure
{
    "hyperparameters": {
        "learning_rate": 0.001,
        "batch_size": 32,
        "optimizer": "adam"
    },
    "metrics": {
        "accuracy": 0.95,
        "precision": 0.93,
        "recall": 0.97
    }
}

# Avoid: Flat and unclear
{
    "hp_lr": 0.001,
    "hp_bs": 32,
    "m_acc": 0.95,
    "m_prec": 0.93
}
Include Units
# Good: Clear units
{"training_time_minutes": 145}
{"temperature_celsius": 22.5}
{"file_size_mb": 234.5}

# Avoid: Ambiguous
{"training_time": 145}   # seconds? minutes? hours?
{"temperature": 22.5}    # celsius? fahrenheit?
Related Pages
Add Context to Your Data Files — Overview of metadata system
Organize Files with Tags — Use the valohai.tags property
Create File Shortcuts with Aliases — Use the valohai.alias property
Versioning and Lineage — Track file dependencies