Tags

Tags are simple text labels that help you categorize and quickly find files across your project.

When to Use Tags

Use tags for:

Experiment organization — Label outputs by experiment ID, iteration, or variant
Quality tracking — Mark files as validated, rejected, or needs-review
Environment management — Distinguish between dev, staging, and production data
Data sources — Track which dataset, factory, or pipeline produced the file
Team ownership — Identify which team or person is responsible
Model metadata — Label by architecture, framework, or approach

How to Add Tags

Add tags through file metadata using the reserved key valohai.tags. Its value is a list of strings.

During Execution

import json

# Tags are a list of strings
metadata = {
    "valohai.tags": ["validated", "production", "experiment-42"]
}

# Save your output file (`model` stands for your own trained model object)
save_path = '/valohai/outputs/model.pkl'
model.save(save_path)

# Save the tags in a sidecar file named <output file>.metadata.json
metadata_path = f'{save_path}.metadata.json'
with open(metadata_path, 'w') as f:
    json.dump(metadata, f)
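
The sidecar pattern above can be wrapped in a small helper so that every output is tagged the same way. This is a minimal sketch under stated assumptions: `write_tags` is a hypothetical name, not part of any Valohai library, and a temporary directory stands in for /valohai/outputs so the snippet also runs outside an execution.

```python
import json
import tempfile
from pathlib import Path


def write_tags(output_path, tags):
    """Write a <file>.metadata.json sidecar holding valohai.tags (hypothetical helper)."""
    output_path = Path(output_path)
    sidecar = output_path.with_name(output_path.name + ".metadata.json")
    sidecar.write_text(json.dumps({"valohai.tags": list(tags)}))
    return sidecar


# Stand-in for /valohai/outputs so the sketch runs anywhere (assumption).
outputs_dir = Path(tempfile.mkdtemp())
model_path = outputs_dir / "model.pkl"
model_path.write_bytes(b"placeholder model bytes")

sidecar_path = write_tags(model_path, ["validated", "production", "experiment-42"])
print(sidecar_path.name)
```

Inside an execution, the same helper would be called with a path under /valohai/outputs instead of the temporary directory.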

For Multiple Files

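When tagging many files, write a single valohai.metadata.jsonl file to the outputs directory instead of one metadata file per output. Each line is a JSON object that names a file and its metadata: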

import json

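# Save all output files (`image` stands for your own image object, e.g. a PIL image)
for i in range(100):
    image.save(f'/valohai/outputs/image_{i:03d}.jpg')

# Tag all files at once: one JSON object per line, each naming a file and its metadata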
metadata_path = '/valohai/outputs/valohai.metadata.jsonl'
with open(metadata_path, 'w') as f:
    for i in range(100):
        entry = {
            "file": f"image_{i:03d}.jpg",
            "metadata": {
                "valohai.tags": ["processed", "batch-2024-Q1", "validated"]
            }
        }
        json.dump(entry, f)
        f.write('\n')
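
The example above gives every file the same tags. When tags differ per file, one option is to build a mapping from file name to tags first and write it out in one pass. The sketch below assumes the same `file` and `metadata` line format shown above; `write_metadata_jsonl` and the sample tags are hypothetical, and a temporary directory stands in for /valohai/outputs.

```python
import json
import tempfile
from pathlib import Path


def write_metadata_jsonl(outputs_dir, tags_by_file):
    """Write one {"file": ..., "metadata": ...} JSON object per line (hypothetical helper)."""
    jsonl_path = Path(outputs_dir) / "valohai.metadata.jsonl"
    with open(jsonl_path, "w") as f:
        for file_name, tags in tags_by_file.items():
            entry = {"file": file_name, "metadata": {"valohai.tags": list(tags)}}
            f.write(json.dumps(entry) + "\n")
    return jsonl_path


# Stand-in for /valohai/outputs (assumption); the tags are illustrative only.
outputs_dir = Path(tempfile.mkdtemp())
tags_by_file = {
    f"image_{i:03d}.jpg": ["processed", "validated" if i % 2 == 0 else "needs-review"]
    for i in range(4)
}

jsonl_path = write_metadata_jsonl(outputs_dir, tags_by_file)

# Read the file back to confirm that each line parses as a standalone JSON object.
entries = [json.loads(line) for line in jsonl_path.read_text().splitlines()]
```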

💡 For complete details on metadata methods, see Add Context to Your Data Files

Find Files Using Tags

Web Application

Search and filter by tags in the Valohai UI:

Navigate to your project's Data tab
Use the search bar to find files by tag name
Click on any file to see all its tags
Filter by multiple tags to narrow results

When creating executions, search for tagged files in the input browser.

Common Tagging Patterns

Experiment Tracking

Label outputs by experiment details:

metadata = {
    "valohai.tags": [
        "experiment-123",
        "baseline",
        "resnet50",
        "2024-Q1"
    ]
}

Track which experiment produced each output and compare results across iterations.
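
Comparing iterations is easier when every run builds its tags the same way. As a rough sketch, the tags could be generated from the run's details instead of being typed by hand. `experiment_tags` and the `runs` list are hypothetical and exist only for illustration.

```python
def experiment_tags(experiment_id, variant, architecture, period):
    """Build experiment tags with one predictable pattern (hypothetical helper)."""
    return [f"experiment-{experiment_id}", variant, architecture, period]


# Illustrative run descriptions; in practice these would come from your own parameters.
runs = [
    {"experiment_id": 123, "variant": "baseline", "architecture": "resnet50", "period": "2024-Q1"},
    {"experiment_id": 124, "variant": "augmented", "architecture": "resnet50", "period": "2024-Q1"},
]

# One metadata dict per run, ready to be saved next to that run's outputs.
metadata_per_run = [{"valohai.tags": experiment_tags(**run)} for run in runs]
```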

Quality Labels

Mark validation and approval status:

metadata = {
    "valohai.tags": [
        "validated",
        "accuracy-95",
        "production-ready",
        "approved-by-ml-team"
    ]
}

Quickly identify files that have passed quality gates or are ready for production.

Environment Management

Separate outputs by deployment stage:

metadata = {
    "valohai.tags": [
        "staging",
        "preprocessing-v2",
        "ready-for-testing"
    ]
}

Keep development, staging, and production data clearly separated.

Data Source Tracking

Label files by origin:

metadata = {
    "valohai.tags": [
        "factory-eu-02",
        "production-line-A",
        "sensor-data",
        "quality-checked"
    ]
}

Track where data came from for traceability and compliance.

Team Ownership

Identify responsible teams or individuals:

metadata = {
    "valohai.tags": [
        "team-ml-research",
        "owner-alice",
        "project-vision",
        "requires-review"
    ]
}

Help teams coordinate on shared data and outputs.

Model Architecture

Organize models by technical approach:

metadata = {
    "valohai.tags": [
        "transformer",
        "pytorch",
        "fine-tuned",
        "multilingual"
    ]
}

Filter and compare models by architecture or framework.

Combining Multiple Categories

metadata = {
    "valohai.tags": [
        # Experiment
        "experiment-456",
        
        # Quality
        "validated",
        
        # Environment
        "staging",
        
        # Architecture
        "resnet50",
        
        # Team
        "team-cv",
        
        # Source
        "dataset-imagenet-v2"
    ]
}

Use multiple tag categories for flexible filtering and organization.
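
When each category is maintained separately in code, the groups can be merged just before the metadata is written. The sketch below is one possible approach, not a Valohai feature: `combine_tags` is a hypothetical helper that flattens the groups and drops repeated tags while keeping their first-seen order.

```python
def combine_tags(*tag_groups):
    """Flatten tag groups into one list, dropping duplicates but keeping first-seen order."""
    combined = []
    seen = set()
    for group in tag_groups:
        for tag in group:
            if tag not in seen:
                seen.add(tag)
                combined.append(tag)
    return combined


experiment = ["experiment-456", "resnet50"]
quality = ["validated"]
environment = ["staging"]
ownership = ["team-cv", "validated"]  # repeats "validated" on purpose

metadata = {"valohai.tags": combine_tags(experiment, quality, environment, ownership)}
```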

Best Practices

Use Consistent Naming Conventions

# Good: Consistent, predictable patterns
"experiment-123"
"experiment-124"
"team-ml-research"
"team-data-engineering"

# Avoid: Inconsistent naming
"exp123"
"experiment_124"
"ML Research Team"
"data-eng"

Keep Tags Concise

# Good: Short, scannable
"validated"
"prod"
"v2"

# Avoid: Long, verbose
"this-file-has-been-validated-by-the-quality-team"
"production-environment-version-2"

Use Tags for Filtering, Not Storage

# Good: Tags for categories
{"valohai.tags": ["validated", "high-accuracy"]}

# Wrong: Don't store data in tags
{"valohai.tags": ["accuracy=0.95", "epochs=100"]}  # Use properties instead
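
One way to follow this rule is to derive a category tag from a metric and keep the metric itself as a property. The sketch assumes that custom properties are plain top-level keys stored next to valohai.tags in the same metadata JSON. The threshold and the metric names are illustrative and are not Valohai settings.

```python
import json

HIGH_ACCURACY_THRESHOLD = 0.9  # illustrative cutoff, not a Valohai setting

metrics = {"accuracy": 0.95, "epochs": 100}

# Tags carry the category; the measured values stay as typed properties.
tags = ["validated"]
if metrics["accuracy"] >= HIGH_ACCURACY_THRESHOLD:
    tags.append("high-accuracy")

# Assumption: custom properties are plain top-level keys next to valohai.tags.
metadata = {"valohai.tags": tags, **metrics}
print(json.dumps(metadata))
```

Keeping the number as a property preserves its type and precision, while the tag stays short enough to scan and filter on.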

For detailed metrics, use custom properties instead.

Related Pages

Add Context to Your Data Files — Overview of metadata system
Create File Shortcuts with Aliases — Stable pointers to tagged files
Track Custom Metadata — Store rich data alongside tags
Versioning and Lineage — Track file dependencies

Next Steps

Learn how to create aliases pointing to tagged files
Explore custom properties for detailed metadata
Set up datasets to group tagged files
