# Aliases

Aliases are mutable pointers to immutable files. They let you reference "the latest validated model" or "the current training file" without hardcoding specific file IDs.

***

### Aliases vs. Direct datum:// Links

Understanding when to use each:

```
Use Alias When:
├─ Reference should update over time
├─ Multiple people need same reference
├─ Managing production/staging versions
└─ "Latest" or "current" semantics needed

Use Direct datum:// Link When:
├─ Need exact reproducibility
├─ Auditing specific file version
├─ Archiving experiment results
└─ Immutable reference required
```

**Key difference:**

* **Aliases** = Can be updated to point to new files (e.g., `model-prod` → new model each week)
* **Direct links** = Always point to the same file (e.g., `datum://abc123` → immutable)

> 💡 **Reproducibility!**
>
> When you create an execution, Valohai resolves the alias to a specific `datum://` link at that moment. If the alias changes later, your old execution still references the original file!
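
The resolve-once behavior described above can be pictured with a small in-memory model. This is only an illustration of the semantics, not Valohai's implementation or API: the `aliases` dictionary, the `create_execution` function, and the ID `datum://def456` are all hypothetical.

```python
# Illustrative model only: a dict stands in for the alias registry.
aliases = {"model-prod": "datum://abc123"}


def create_execution(input_alias: str) -> dict:
    """Resolve the alias once, at creation time, and store the fixed link."""
    return {"input": aliases[input_alias]}


old_execution = create_execution("model-prod")

# The alias is later repointed to a new file (hypothetical ID).
aliases["model-prod"] = "datum://def456"

new_execution = create_execution("model-prod")

print(old_execution["input"])  # datum://abc123 (unchanged by the alias update)
print(new_execution["input"])  # datum://def456
```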

***

### How to Add Aliases

Aliases are created through metadata using the reserved key `valohai.alias`.

#### During Execution

```python
import json

# Alias must be a single string (not a list)
metadata = {
    "valohai.alias": "model-prod",
}

# Save your output file (`model` is your already-trained model object)
save_path = "/valohai/outputs/model.pkl"
model.save(save_path)

# Save metadata with alias
metadata_path = "/valohai/outputs/model.pkl.metadata.json"
with open(metadata_path, "w") as f:
    json.dump(metadata, f)
```

When this execution completes, the alias `model-prod` will point to this output. If `model-prod` already exists, it updates to point to the new file.
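
If several outputs need aliases, the sidecar naming convention shown above (`<output file>.metadata.json`) can be wrapped in a small helper. The sketch below is one possible shape, not part of any Valohai library: `write_alias_metadata` is a hypothetical name, and the demo writes to a temporary directory so it can run anywhere. Inside an execution the output file would live under `/valohai/outputs`.

```python
import json
import tempfile
from pathlib import Path


def write_alias_metadata(output_file: Path, alias: str) -> Path:
    """Write `<output_file>.metadata.json` next to the output, containing the alias."""
    sidecar = output_file.with_name(output_file.name + ".metadata.json")
    sidecar.write_text(json.dumps({"valohai.alias": alias}))
    return sidecar


# Demo in a temporary directory with a placeholder file instead of a real model.
with tempfile.TemporaryDirectory() as tmp:
    model_file = Path(tmp) / "model.pkl"
    model_file.write_bytes(b"placeholder")

    sidecar = write_alias_metadata(model_file, "model-prod")
    sidecar_name = sidecar.name
    written = json.loads(sidecar.read_text())

print(sidecar_name)  # model.pkl.metadata.json
```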

#### Combine with Tags and Properties

```python
import json

metadata = {
    # Alias (single string)
    "valohai.alias": "model-staging",
    # Tags (list of strings)
    "valohai.tags": ["validated", "resnet50", "2024-Q1"],
    # Custom properties
    "accuracy": 0.95,
    "approved_by": "ml-team",
}

# `model` is your already-trained model object
save_path = "/valohai/outputs/model.pkl"
model.save(save_path)

metadata_path = "/valohai/outputs/model.pkl.metadata.json"
with open(metadata_path, "w") as f:
    json.dump(metadata, f)
```

> 💡 **For complete details on metadata methods**, see [Add Context to Your Data Files](/data/data-versioning/metadata-overview.md)
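
Because the alias must be a single string while tags are a list of strings, mixing the two shapes up is an easy mistake. A minimal pre-save check might look like the following. It only mirrors the two rules stated in the examples above; `validate_metadata` is a hypothetical helper, not a Valohai function.

```python
def validate_metadata(metadata: dict) -> list:
    """Return a list of problems found in the reserved keys (empty if none)."""
    problems = []

    alias = metadata.get("valohai.alias")
    if alias is not None and not isinstance(alias, str):
        problems.append("valohai.alias must be a single string")

    tags = metadata.get("valohai.tags", [])
    if not isinstance(tags, list) or not all(isinstance(t, str) for t in tags):
        problems.append("valohai.tags must be a list of strings")

    return problems


ok = validate_metadata(
    {"valohai.alias": "model-staging", "valohai.tags": ["validated"], "accuracy": 0.95}
)
bad = validate_metadata({"valohai.alias": ["model-staging", "model-prod"]})

print(ok)   # []
print(bad)  # ['valohai.alias must be a single string']
```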

***

### Use Aliases as Inputs

#### In valohai.yaml

Set aliases as default inputs for your steps:

```yaml
- step:
    name: batch-inference
    image: python:3.9
    command: python predict.py
    inputs:
      - name: model
        default: datum://model-prod
      - name: data
        default: datum://inference-data-latest
```

Each new execution resolves these aliases when it is created, so it uses whichever files they point to at that moment.

#### In Web Application

When creating an execution, search for your alias name in the input file browser:

<figure><img src="/files/Y1pIcUG8gRFvQ37BuSYW" alt=""><figcaption></figcaption></figure>

The UI shows both the alias name and the current file it points to.

***

### Change Tracking and History

Valohai tracks every time an alias is updated.

**What's tracked:**

* When the alias was changed
* What file it pointed to before
* What file it points to now
* Who made the change (if via UI)

**Why this matters:**

* **Debugging** — "When did model-prod start failing? Let's check when it was last updated."
* **Auditing** — "Which model version was in production on March 15th?"
* **Rollback** — "The new model is worse, point the alias back to the previous version."

View alias history in the **Data → Aliases** tab of your project.
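
To make the auditing and rollback questions concrete, the sketch below models an alias history as a time-ordered list and answers "what did the alias point to on a given date?" and "what was the previous target?". The history entries, file IDs, and function names are hypothetical. In practice you would read this information from the alias history in the web application.

```python
from bisect import bisect_right
from datetime import datetime

# Hypothetical change log for one alias: (changed_at, datum link), oldest first.
history = [
    (datetime(2024, 1, 15), "datum://abc123"),
    (datetime(2024, 3, 1), "datum://def456"),
    (datetime(2024, 3, 20), "datum://ghi789"),
]


def target_at(history, when):
    """Return the link the alias pointed to at `when`, or None if it did not exist yet."""
    times = [changed_at for changed_at, _ in history]
    index = bisect_right(times, when)
    return history[index - 1][1] if index else None


def previous_target(history):
    """Return the link to roll back to, or None if there is no earlier entry."""
    return history[-2][1] if len(history) >= 2 else None


print(target_at(history, datetime(2024, 3, 15)))  # datum://def456
print(previous_target(history))                   # datum://def456
```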

***

### Common Use Cases

#### Production Model References

Point production systems to the current approved model:

```yaml
# Inference pipeline
inputs:
  - name: model
    default: datum://model-prod
```

```python
# Update when promoting new model
metadata = {
    "valohai.alias": "model-prod",
    "valohai.tags": ["validated", "production"],
    "promoted_date": "2024-01-15",
    "previous_model": "datum://abc123",
}
```

**Workflow:**

1. Train new model → tag as `validated`
2. Test in staging → update `model-staging` alias
3. Approve for production → update `model-prod` alias
4. Production automatically uses new model on next run

***

#### Environment-Specific Datasets

Use different aliases for each environment:

```yaml
# Development
inputs:
  - name: training-data
    default: datum://train-data-dev
```

```yaml
# Staging
inputs:
  - name: training-data
    default: datum://train-data-staging
```

```yaml
# Production
inputs:
  - name: training-data
    default: datum://train-data-prod
```

The same pipeline code works across all environments; only the alias changes.
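
If the environment is chosen programmatically, the alias can be derived from its name instead of being duplicated in three configurations. This is a sketch under the assumption that your aliases follow the `train-data-<environment>` pattern used above; `training_data_uri` is a hypothetical helper.

```python
ENVIRONMENTS = ("dev", "staging", "prod")


def training_data_uri(environment: str) -> str:
    """Build the alias link for an environment, rejecting unknown names."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment!r}")
    return f"datum://train-data-{environment}"


print(training_data_uri("staging"))  # datum://train-data-staging
```

Rejecting unknown names early avoids creating an execution that points at an alias which does not exist.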

***

#### Rolling Datasets

Always train on the most recent data:

```yaml
inputs:
  - name: daily-data
    default: datum://latest-processed-batch
```

```python
# After daily preprocessing job
metadata = {
    "valohai.alias": "latest-processed-batch",
    "valohai.tags": ["daily-batch", "2024-01-15"],
    "row_count": 50000,
    "processing_date": "2024-01-15",
}
```

Training jobs automatically pick up the latest data without manual updates.
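
In a recurring job the date would normally be computed rather than hardcoded. A minimal sketch, assuming the same keys as the example above (`daily_batch_metadata` is a hypothetical helper name):

```python
from datetime import date


def daily_batch_metadata(row_count: int, processing_date: date) -> dict:
    """Build the metadata for one daily batch, deriving the tag from the date."""
    day = processing_date.isoformat()
    return {
        "valohai.alias": "latest-processed-batch",
        "valohai.tags": ["daily-batch", day],
        "row_count": row_count,
        "processing_date": day,
    }


metadata = daily_batch_metadata(50000, date(2024, 1, 15))
print(metadata["valohai.tags"])  # ['daily-batch', '2024-01-15']
```

In a real job you would pass `date.today()` and write the result to the output's `.metadata.json` sidecar file.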

***

#### A/B Testing

Compare model versions side-by-side:

```yaml
- step:
    name: ab-test
    image: python:3.9
    command: python compare.py
    inputs:
      - name: model-a
        default: datum://model-candidate-a
      - name: model-b
        default: datum://model-candidate-b
      - name: test-data
        default: datum://test-set-fixed
```

Update aliases to test different model combinations without changing pipeline code.

***

#### Canary Deployments

Gradually roll out new models:

```yaml
# 90% traffic
inputs:
  - name: model-stable
    default: datum://model-prod
```

```yaml
# 10% traffic
inputs:
  - name: model-canary
    default: datum://model-canary
```

Monitor canary performance before promoting to full production by updating the `model-prod` alias.
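
How traffic is split between the two aliases is up to your serving layer and is not something the alias itself controls. One common approach, sketched here as an assumption rather than a Valohai feature, is to hash a stable request or user ID into 100 buckets so that the same ID always gets the same model. `pick_model_alias` and the ID format are hypothetical.

```python
import hashlib


def pick_model_alias(request_id: str, canary_percent: int = 10) -> str:
    """Deterministically route about `canary_percent`% of IDs to the canary alias."""
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100  # stable bucket in 0..99
    if bucket < canary_percent:
        return "datum://model-canary"
    return "datum://model-prod"


# Measure the canary share over a batch of synthetic IDs.
choices = [pick_model_alias(f"user-{i}") for i in range(10_000)]
canary_share = choices.count("datum://model-canary") / len(choices)
print(f"canary share: {canary_share:.1%}")
```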

***

### Managing Aliases

#### Via Web Application

1. Open your project
2. Navigate to **Data → Aliases** tab
3. Click **"Create new datum alias"** or select existing alias
4. Choose the file the alias should point to
5. View change history for each alias

#### Via Code

Update aliases programmatically when saving outputs:

```python
# Promote staging to production
metadata = {
    "valohai.alias": "model-prod",  # Updates existing alias
    "valohai.tags": ["promoted-from-staging"],
    "promoted_at": "2024-01-15T10:30:00Z",
    "previous_version": "datum://xyz789",
}
```

***

### Best Practices

#### Use Descriptive Names

```python
# Good: Clear purpose
"model-prod"
"model-staging"
"train-data-latest"
"validation-set-fixed"

# Avoid: Ambiguous
"model1"
"data"
"latest"
"temp"
```

#### Version Your Aliases

For complex environments:

```python
# Production versions
"model-prod-v1"
"model-prod-v2"

# Regional variants
"model-prod-us"
"model-prod-eu"

# Use case specific
"model-prod-batch-inference"
"model-prod-real-time-api"
```

#### Document Alias Updates

Include context in metadata:

```python
metadata = {
    "valohai.alias": "model-prod",
    "valohai.tags": ["production", "promoted"],
    "promoted_by": "ml-team",
    "promoted_reason": "accuracy improved by 10 percentage points",
    "validation_score": 0.95,
    "previous_score": 0.85,
}
```
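
To keep this context consistent across promotions, the metadata can be generated from the scores instead of typed by hand. The sketch below uses the same keys as the example above; `build_promotion_metadata` and the wording of the generated reason are assumptions.

```python
def build_promotion_metadata(
    alias: str, new_score: float, previous_score: float, promoted_by: str
) -> dict:
    """Build promotion metadata, deriving the reason from the two scores."""
    # Difference expressed in percentage points, e.g. 0.85 -> 0.95 is 10.0 points.
    delta_points = round((new_score - previous_score) * 100, 1)
    return {
        "valohai.alias": alias,
        "valohai.tags": ["production", "promoted"],
        "promoted_by": promoted_by,
        "promoted_reason": f"{delta_points:+.1f} percentage points accuracy",
        "validation_score": new_score,
        "previous_score": previous_score,
    }


metadata = build_promotion_metadata("model-prod", 0.95, 0.85, "ml-team")
print(metadata["promoted_reason"])  # +10.0 percentage points accuracy
```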

***

### Related Pages

* [Add Context to Your Data Files](/data/data-versioning/metadata-overview.md) — Overview of metadata system
* [Organize Files with Tags](/data/data-versioning/metadata-overview/tags.md) — Label files before creating aliases
* [Track Custom Metadata](/data/data-versioning/metadata-overview/custom-properties.md) — Store rich data alongside aliases
* [Load Data in Jobs](/data/data-versioning/load-files-in-jobs.md) — Use aliases as execution inputs
* [Versioning and Lineage](/data/data-versioning.md) — Track file dependencies

***

### Next Steps

* Learn how to [use tags](/data/data-versioning/metadata-overview/tags.md) to organize files before aliasing
* Explore [custom properties](/data/data-versioning/metadata-overview/custom-properties.md) for tracking alias metadata
* Set up [datasets](/data/datasets.md) using aliased files

