# Aliases

Aliases are mutable pointers to immutable files. They let you reference "the latest validated model" or "the current training file" without hardcoding specific file IDs.

***

### Aliases vs. Direct datum:// Links

Understanding when to use each:

```
Use Alias When:
├─ Reference should update over time
├─ Multiple people need same reference
├─ Managing production/staging versions
└─ "Latest" or "current" semantics needed

Use Direct datum:// Link When:
├─ Need exact reproducibility
├─ Auditing specific file version
├─ Archiving experiment results
└─ Immutable reference required
```

**Key difference:**

* **Aliases** = Can be updated to point to new files (e.g., `model-prod` → new model each week)
* **Direct links** = Always point to the same file (e.g., `datum://abc123` → immutable)

> 💡 **Reproducibility!**
>
> When you create an execution, Valohai resolves the alias to a specific `datum://` link at that moment. If the alias changes later, your old execution still references the original file!
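
The behavior described in the note above can be pictured with a small in-memory sketch. This is a toy model for illustration only, not Valohai's implementation: the `aliases` dictionary, the `create_execution` function, and the `datum://def456` ID are all hypothetical.

```python
# Toy in-memory model of alias resolution; not Valohai's implementation.
aliases = {"model-prod": "datum://abc123"}  # alias name -> current datum link


def create_execution(input_alias: str) -> dict:
    """Resolve the alias once and store the concrete datum link."""
    return {"input_alias": input_alias, "resolved_input": aliases[input_alias]}


old_execution = create_execution("model-prod")

# Later, the alias is moved to a newer file (hypothetical ID).
aliases["model-prod"] = "datum://def456"
new_execution = create_execution("model-prod")

print(old_execution["resolved_input"])  # datum://abc123
print(new_execution["resolved_input"])  # datum://def456
```

The old execution keeps the link it was created with, which is what makes it reproducible even after the alias moves.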

***

### How to Add Aliases

Aliases are created through metadata using the reserved key `valohai.alias`.

#### During Execution

```python
import json

# Alias must be a single string (not a list)
metadata = {
    "valohai.alias": "model-prod",
}

# Save your output file (`model` is assumed to be your trained model object)
save_path = "/valohai/outputs/model.pkl"
model.save(save_path)

# Save metadata with alias
metadata_path = "/valohai/outputs/model.pkl.metadata.json"
with open(metadata_path, "w") as f:
    json.dump(metadata, f)
```

When this execution completes, the alias `model-prod` will point to this output. If `model-prod` already exists, it updates to point to the new file.
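
If several steps attach aliases, the sidecar-writing logic above could be wrapped in a small helper. The following is a minimal sketch: `write_alias_metadata` is a hypothetical function name, not part of any Valohai library, and it only follows the `<output file>.metadata.json` naming shown in the example above.

```python
import json
from pathlib import Path


def write_alias_metadata(output_file, alias, **properties):
    """Write `<output_file>.metadata.json` next to an output file.

    Hypothetical helper: checks that the alias is a single non-empty
    string, then merges it with any extra custom properties.
    """
    if not isinstance(alias, str):
        raise TypeError("valohai.alias must be a single string, not a list")
    if not alias:
        raise ValueError("valohai.alias must not be empty")

    output_file = Path(output_file)
    metadata = {"valohai.alias": alias, **properties}

    # model.pkl -> model.pkl.metadata.json (same directory)
    sidecar = output_file.with_name(output_file.name + ".metadata.json")
    sidecar.write_text(json.dumps(metadata, indent=2))
    return sidecar
```

Inside an execution, a call might look like `write_alias_metadata("/valohai/outputs/model.pkl", "model-prod", accuracy=0.95)`, made after the model file itself has been saved.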

#### Combine with Tags and Properties

```python
import json

metadata = {
    # Alias (single string)
    "valohai.alias": "model-staging",
    # Tags (list of strings)
    "valohai.tags": ["validated", "resnet50", "2024-Q1"],
    # Custom properties
    "accuracy": 0.95,
    "approved_by": "ml-team",
}

save_path = "/valohai/outputs/model.pkl"
model.save(save_path)

metadata_path = "/valohai/outputs/model.pkl.metadata.json"
with open(metadata_path, "w") as f:
    json.dump(metadata, f)
```

> 💡 **For complete details on metadata methods**, see [Add Context to Your Data Files](https://docs.valohai.com/data/data-versioning/metadata-overview)

***

### Use Aliases as Inputs

#### In valohai.yaml

Set aliases as default inputs for your steps:

```yaml
- step:
    name: batch-inference
    image: python:3.9
    command: python predict.py
    inputs:
      - name: model
        default: datum://model-prod
      - name: data
        default: datum://inference-data-latest
```

Each new execution resolves these aliases when it is created and uses the files they point to at that moment.
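
Inside the step, the script does not need to know which file the alias resolved to. A minimal sketch of how `predict.py` might locate the file, assuming inputs are downloaded to `/valohai/inputs/<input-name>/` (check [Load Data in Jobs](https://docs.valohai.com/data/data-versioning/load-files-in-jobs) for the authoritative layout). The `find_input_file` helper is hypothetical.

```python
from pathlib import Path


def find_input_file(input_name, inputs_root="/valohai/inputs"):
    """Return the first file found in the directory of a named input.

    Assumption: each input gets its own directory under `inputs_root`.
    """
    input_dir = Path(inputs_root) / input_name
    files = sorted(p for p in input_dir.iterdir() if p.is_file())
    if not files:
        raise FileNotFoundError(f"No files found for input '{input_name}' in {input_dir}")
    return files[0]
```

With the step above, `find_input_file("model")` would return the path of whichever file `model-prod` pointed to when the execution was created.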

#### In Web Application

When creating an execution, search for your alias name in the input file browser:

<figure><img src="https://4109720758-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2Ff3mjTRQNkASbnMbJqzJ2%2Fuploads%2Fgit-blob-6bc8f86b4b0711726b30825e3f2a63c5f70803a1%2Fimage.png?alt=media" alt="Searching for an alias in the input file browser"><figcaption></figcaption></figure>

The UI shows both the alias name and the current file it points to.

***

### Change Tracking and History

Valohai tracks every time an alias is updated.

**What's tracked:**

* When the alias was changed
* What file it pointed to before
* What file it points to now
* Who made the change (if via UI)

**Why this matters:**

* **Debugging** — "When did model-prod start failing? Let's check when it was last updated."
* **Auditing** — "Which model version was in production on March 15th?"
* **Rollback** — "The new model is worse, point the alias back to the previous version."

View alias history in the **Data → Aliases** tab of your project.
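
To illustrate how a change log answers the auditing and rollback questions above, here is a local sketch. It is not a Valohai API: the `AliasHistory` class, the dates, and the `datum://def456` ID are made up for the example, and the real history lives in the web application.

```python
from bisect import bisect_right
from datetime import datetime


class AliasHistory:
    """Local, illustrative change log for one alias (not a Valohai API)."""

    def __init__(self):
        self._changes = []  # list of (changed_at, target), oldest first

    def update(self, changed_at, target):
        """Record that the alias started pointing to `target`."""
        if self._changes and changed_at < self._changes[-1][0]:
            raise ValueError("changes must be recorded in chronological order")
        self._changes.append((changed_at, target))

    def target_at(self, moment):
        """Return the target the alias pointed to at `moment`, or None."""
        times = [t for t, _ in self._changes]
        index = bisect_right(times, moment)
        return self._changes[index - 1][1] if index else None

    def previous_target(self):
        """Return the target before the latest change (rollback candidate)."""
        return self._changes[-2][1] if len(self._changes) >= 2 else None


history = AliasHistory()
history.update(datetime(2024, 3, 1), "datum://abc123")
history.update(datetime(2024, 3, 20), "datum://def456")

print(history.target_at(datetime(2024, 3, 15)))  # datum://abc123
print(history.previous_target())  # datum://abc123
```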

***

### Common Use Cases

#### Production Model References

Point production systems to the current approved model:

```yaml
# Inference pipeline
inputs:
  - name: model
    default: datum://model-prod
```

```python
# Update when promoting new model
metadata = {
    "valohai.alias": "model-prod",
    "valohai.tags": ["validated", "production"],
    "promoted_date": "2024-01-15",
    "previous_model": "datum://abc123",
}
```

**Workflow:**

1. Train new model → tag as `validated`
2. Test in staging → update `model-staging` alias
3. Approve for production → update `model-prod` alias
4. Production automatically uses the new model on its next run
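
One way to automate the choice between the staging and production aliases in this workflow is a simple accuracy gate applied when the model output is saved. The sketch below is an assumption-heavy illustration: `build_promotion_metadata`, the `min_gain` threshold, and the `previous_accuracy` property are hypothetical, and your team's approval process may well involve a manual step instead.

```python
def build_promotion_metadata(accuracy, current_prod_accuracy, min_gain=0.01):
    """Pick the alias for a new model based on a simple accuracy gate.

    The model gets `model-prod` only if it beats the current production
    accuracy by at least `min_gain`; otherwise it stays on `model-staging`.
    """
    promote = accuracy >= current_prod_accuracy + min_gain
    return {
        "valohai.alias": "model-prod" if promote else "model-staging",
        "valohai.tags": ["validated", "production"] if promote else ["validated"],
        "accuracy": accuracy,
        "previous_accuracy": current_prod_accuracy,
    }


print(build_promotion_metadata(0.95, 0.85)["valohai.alias"])  # model-prod
print(build_promotion_metadata(0.855, 0.85)["valohai.alias"])  # model-staging
```

The returned dictionary would then be written to the output's `.metadata.json` sidecar file as shown earlier.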

***

#### Environment-Specific Datasets

Use different aliases for each environment:

```yaml
# Development
inputs:
  - name: training-data
    default: datum://train-data-dev
```

```yaml
# Staging
inputs:
  - name: training-data
    default: datum://train-data-staging
```

```yaml
# Production
inputs:
  - name: training-data
    default: datum://train-data-prod
```

The same pipeline code works across all environments; only the alias changes.

***

#### Rolling Datasets

Always train on the most recent data:

```yaml
inputs:
  - name: daily-data
    default: datum://latest-processed-batch
```

```python
# After daily preprocessing job
metadata = {
    "valohai.alias": "latest-processed-batch",
    "valohai.tags": ["daily-batch", "2024-01-15"],
    "row_count": 50000,
    "processing_date": "2024-01-15",
}
```

Training jobs automatically pick up the latest data without manual updates.
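
In a scheduled job, the date values would normally be computed rather than hardcoded. A minimal sketch, assuming a hypothetical `build_batch_metadata` helper that produces the same metadata as the example above:

```python
from datetime import date


def build_batch_metadata(row_count, batch_date=None):
    """Build alias metadata for a daily batch, defaulting to today's date."""
    batch_date = batch_date or date.today()
    day = batch_date.isoformat()  # e.g. "2024-01-15"
    return {
        "valohai.alias": "latest-processed-batch",
        "valohai.tags": ["daily-batch", day],
        "row_count": row_count,
        "processing_date": day,
    }


print(build_batch_metadata(50000, date(2024, 1, 15)))
```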

***

#### A/B Testing

Compare model versions side-by-side:

```yaml
- step:
    name: ab-test
    image: python:3.9
    command: python compare.py
    inputs:
      - name: model-a
        default: datum://model-candidate-a
      - name: model-b
        default: datum://model-candidate-b
      - name: test-data
        default: datum://test-set-fixed
```

Update aliases to test different model combinations without changing pipeline code.

***

#### Canary Deployments

Gradually roll out new models:

```yaml
# 90% traffic
inputs:
  - name: model-stable
    default: datum://model-prod
```

```yaml
# 10% traffic
inputs:
  - name: model-canary
    default: datum://model-canary
```

Monitor the canary's performance, then promote it to full production by updating the `model-prod` alias.
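
The 90/10 split itself happens in your serving or orchestration code, not in the alias. As a rough sketch under that assumption, a deterministic hash-based router could decide which of the two inputs above handles a request. `pick_model_input` and the request ID format are hypothetical.

```python
import hashlib


def pick_model_input(request_id, canary_percent=10):
    """Deterministically route a request to the stable or canary input.

    Hashing the request ID gives each request a stable bucket in 0..99,
    so the same ID is always routed to the same model.
    """
    digest = hashlib.sha256(request_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "model-canary" if bucket < canary_percent else "model-stable"


choice = pick_model_input("req-42")
print(choice)  # either "model-stable" or "model-canary"
```

Raising `canary_percent` step by step widens the rollout without touching either alias.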

***

### Managing Aliases

#### Via Web Application

1. Open your project
2. Navigate to **Data → Aliases** tab
3. Click **"Create new datum alias"** or select existing alias
4. Choose the file the alias should point to
5. View change history for each alias

#### Via Code

Update aliases programmatically when saving outputs:

```python
# Promote staging to production
metadata = {
    "valohai.alias": "model-prod",  # Updates existing alias
    "valohai.tags": ["promoted-from-staging"],
    "promoted_at": "2024-01-15T10:30:00Z",
    "previous_version": "datum://xyz789",
}
```

***

### Best Practices

#### Use Descriptive Names

```python
# Good: Clear purpose
"model-prod"
"model-staging"
"train-data-latest"
"validation-set-fixed"

# Avoid: Ambiguous
"model1"
"data"
"latest"
"temp"
```

#### Version Your Aliases

For complex environments:

```python
# Production versions
"model-prod-v1"
"model-prod-v2"
# Regional variants
"model-prod-us"
"model-prod-eu"

# Use case specific
"model-prod-batch-inference"
"model-prod-real-time-api"
```

#### Document Alias Updates

Include context in metadata:

```python
metadata = {
    "valohai.alias": "model-prod",
    "valohai.tags": ["production", "promoted"],
    "promoted_by": "ml-team",
    "promoted_reason": "10% accuracy improvement",
    "validation_score": 0.95,
    "previous_score": 0.85,
}
```

***

### Related Pages

* [Add Context to Your Data Files](https://docs.valohai.com/data/data-versioning/metadata-overview) — Overview of metadata system
* [Organize Files with Tags](https://docs.valohai.com/data/data-versioning/metadata-overview/tags) — Label files before creating aliases
* [Track Custom Metadata](https://docs.valohai.com/data/data-versioning/metadata-overview/custom-properties) — Store rich data alongside aliases
* [Load Data in Jobs](https://docs.valohai.com/data/data-versioning/load-files-in-jobs) — Use aliases as execution inputs
* [Versioning and Lineage](https://docs.valohai.com/data/data-versioning) — Track file dependencies

***

### Next Steps

* Learn how to [use tags](https://docs.valohai.com/data/data-versioning/metadata-overview/tags) to organize files before aliasing
* Explore [custom properties](https://docs.valohai.com/data/data-versioning/metadata-overview/custom-properties) for tracking alias metadata
* Set up [datasets](https://docs.valohai.com/data/datasets) using aliased files
