# AWS Elastic File System

Mount AWS Elastic File System (EFS) to access shared network storage directly from Valohai executions.

***

### Overview

AWS EFS provides managed NFS storage that you can mount in Valohai executions that run in AWS environments (EC2 instances).

Use EFS to:

* Access large datasets without downloading
* Share preprocessed data across multiple executions
* Cache intermediate results on fast shared storage
* Process data in place and save versioned outputs

> ⚠️ **Important:** Files on EFS mounts are NOT versioned by Valohai. Always save final results to `/valohai/outputs/` for reproducibility.

***

### Prerequisites

Before mounting EFS in Valohai:

1. **Existing EFS file system** — Use an existing EFS file system or create a new one in the AWS Console
2. **Same VPC or VPC peering** — EFS must be in the same VPC as your Valohai resources, or the two VPCs must be connected through VPC peering
3. **Security group access** — Configure the EFS security group to allow inbound NFS traffic (TCP port 2049) from the Valohai workers' security group (`sg-valohai-workers`)
4. **DNS enabled** — If connecting via DNS name, ensure DNS hostnames and DNS resolution are enabled in your VPC

***

### Setup: Configure EFS Access

#### Step 1: Find Your EFS Details

In AWS Console:

1. Go to **EFS → File systems**
2. Find your file system
3. Note the **File system ID** (e.g., `fs-1234aa62`)
4. Note the **DNS name** (e.g., `fs-1234aa62.efs.eu-west-1.amazonaws.com`)
5. Check the **Mount targets** tab for availability zone placement

***

#### Step 2: Configure Security Group

1. In AWS Console, go to **EC2 → Security Groups**
2. Find your EFS security group (or create one)
3. Add inbound rule:
   * **Type:** NFS
   * **Protocol:** TCP
   * **Port:** 2049
   * **Source:** `sg-valohai-workers` (Valohai workers security group)
4. Save rules
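
To confirm that the inbound rule works, you can test TCP connectivity to port 2049 from a machine in the workers' security group. The following is a minimal sketch using only the Python standard library. The function name `can_reach_nfs` is illustrative and not part of any Valohai or AWS API.

```python
import socket


def can_reach_nfs(host: str, port: int = 2049, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, `can_reach_nfs("fs-1234aa62.efs.eu-west-1.amazonaws.com")` should return `True` once the security group rule is in place. A `False` result only tells you that the connection failed, not why, so check the security group, VPC routing, and DNS settings in turn.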

***

#### Step 3: Verify VPC Configuration

Ensure your VPC has DNS support enabled:

1. Go to **VPC → Your VPCs**
2. Select your VPC
3. Click **Actions → Edit VPC settings**
4. Verify both are enabled:
   * ✅ Enable DNS resolution
   * ✅ Enable DNS hostnames
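
If both settings are enabled, the EFS DNS name should resolve from instances inside the VPC. As a rough check, the sketch below tries to resolve a hostname with the Python standard library. The helper name `resolves` is hypothetical, and the check is only meaningful when it runs from inside the VPC.

```python
import socket


def resolves(hostname: str, port: int = 2049) -> bool:
    """Return True if hostname resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, port)) > 0
    except (OSError, UnicodeError):
        # socket.gaierror is a subclass of OSError;
        # UnicodeError covers malformed hostnames.
        return False
```

Calling `resolves("fs-1234aa62.efs.eu-west-1.amazonaws.com")` with your own DNS name returns `False` when VPC DNS support is disabled or when the file system has no mount target in the instance's availability zone.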

***

### Mount EFS in Execution

#### Basic Mount Configuration

**valohai.yaml:**

```yaml
- step:
    name: process-with-efs
    image: python:3.9
    command:
      - python process_data.py
    mounts:
      - destination: /mnt/efs-data
        source: fs-1234aa62.efs.eu-west-1.amazonaws.com:/
        type: nfs
        readonly: true
```

**Parameters:**

* `destination` — Mount point inside container (e.g., `/mnt/efs-data`)
* `source` — EFS DNS name with path (format: `<file-system-id>.efs.<region>.amazonaws.com:/[path]`)
* `type` — Always `nfs` for EFS
* `readonly` — `true` (recommended) or `false`
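
If you generate `valohai.yaml` mounts programmatically, the `source` format above can be assembled from its parts. This is a small illustrative helper, not a Valohai API. The name `efs_mount_source` and the `fs-` prefix check are assumptions made for the example.

```python
def efs_mount_source(file_system_id: str, region: str, path: str = "/") -> str:
    """Build the `source` value for an EFS mount from its parts."""
    if not file_system_id.startswith("fs-"):
        raise ValueError(f"Unexpected file system ID: {file_system_id!r}")
    # Normalize the path so it always starts with exactly one slash.
    normalized = "/" + path.strip("/")
    return f"{file_system_id}.efs.{region}.amazonaws.com:{normalized}"


print(efs_mount_source("fs-1234aa62", "eu-west-1"))
# fs-1234aa62.efs.eu-west-1.amazonaws.com:/
print(efs_mount_source("fs-1234aa62", "eu-west-1", "ml-datasets/training"))
# fs-1234aa62.efs.eu-west-1.amazonaws.com:/ml-datasets/training
```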

***

#### Mount Specific EFS Directory

```yaml
mounts:
  - destination: /mnt/training-data
    source: fs-1234aa62.efs.eu-west-1.amazonaws.com:/ml-datasets/training
    type: nfs
    readonly: true
```

This mounts only the `/ml-datasets/training` directory from EFS at `/mnt/training-data`.

***

### Complete Workflow Example

#### Mount → Process → Save Pattern

**Scenario:** Preprocess a large image dataset stored on EFS and save the processed results to Valohai outputs.

**valohai.yaml:**

```yaml
- step:
    name: preprocess-images-from-efs
    image: python:3.9
    command:
      - pip install pillow pandas
      - python preprocess.py
    mounts:
      - destination: /mnt/raw-images
        source: fs-abc123.efs.us-east-1.amazonaws.com:/datasets/imagenet
        type: nfs
        readonly: true
    environment-variables:
      - name: BATCH_SIZE
        default: "1000"
```

**preprocess.py:**

```python
import os
from PIL import Image
import pandas as pd
import json

# Configuration
EFS_PATH = "/mnt/raw-images/"
OUTPUT_PATH = "/valohai/outputs/"
BATCH_SIZE = int(os.getenv("BATCH_SIZE", "1000"))

# 1. Read from EFS mount (NOT versioned)
print(f"Scanning EFS mount: {EFS_PATH}")
image_files = [f for f in os.listdir(EFS_PATH) if f.endswith((".jpg", ".png"))]
print(f"Found {len(image_files)} images on EFS")

# 2. Process images in batches
processed_dir = os.path.join(OUTPUT_PATH, "processed_images")
os.makedirs(processed_dir, exist_ok=True)

metadata_records = []

for i, filename in enumerate(image_files[:BATCH_SIZE]):
    # Read from EFS
    input_path = os.path.join(EFS_PATH, filename)
    img = Image.open(input_path)

    # Process (resize, normalize, etc.)
    img_processed = img.resize((224, 224))
    img_processed = img_processed.convert("RGB")

    # Save to Valohai outputs (VERSIONED ✅)
    output_filename = f"processed_{filename}"
    output_path = os.path.join(processed_dir, output_filename)
    img_processed.save(output_path, quality=95)

    # Track metadata
    metadata_records.append(
        {
            "original_filename": filename,
            "processed_filename": output_filename,
            "original_size": img.size,
            "processed_size": img_processed.size,
            "format": img.format,
        },
    )

    if (i + 1) % 100 == 0:
        print(f"Processed {i + 1}/{min(len(image_files), BATCH_SIZE)} images...")

# 3. Save processing metadata
df = pd.DataFrame(metadata_records)
metadata_csv_path = os.path.join(OUTPUT_PATH, "processing_metadata.csv")
df.to_csv(metadata_csv_path, index=False)

# 4. Create dataset version
dataset_metadata = {}

# Add all processed images to dataset
for record in metadata_records:
    file_path = f"processed_images/{record['processed_filename']}"
    dataset_metadata[file_path] = {
        "valohai.dataset-versions": [
            {
                "uri": "dataset://imagenet-processed/batch-001",
            },
        ],
        "valohai.tags": ["preprocessed", "imagenet", "224x224"],
    }

# Add metadata CSV
dataset_metadata["processing_metadata.csv"] = {
    "valohai.dataset-versions": [
        {
            "uri": "dataset://imagenet-processed/batch-001",
        },
    ],
}

# Save dataset metadata
metadata_jsonl_path = os.path.join(OUTPUT_PATH, "valohai.metadata.jsonl")
with open(metadata_jsonl_path, "w") as f:
    for filename, file_meta in dataset_metadata.items():
        json.dump({"file": filename, "metadata": file_meta}, f)
        f.write("\n")

print("\nProcessing complete:")
print(f"  - Processed {len(metadata_records)} images")
print(f"  - Saved to: {processed_dir}")
print("  - Created dataset: dataset://imagenet-processed/batch-001")
```

**Result:**

* ✅ Raw images accessed from EFS (no download time)
* ✅ Processed images saved to `/valohai/outputs/` (versioned)
* ✅ Dataset created for reproducible training
* ✅ Can train on `dataset://imagenet-processed/batch-001` anytime

***

### Best Practices

#### Use Readonly for Input Data

```yaml
# ✅ Good: Readonly prevents accidental modifications
mounts:
  - destination: /mnt/training-data
    readonly: true
```

```yaml
# ⚠️ Avoid: Writable mounts unless the step must write to EFS
mounts:
  - destination: /mnt/training-data
    readonly: false
```
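
To fail fast when a mount that should be read-only turns out to be writable, a script can inspect the mount flags at startup. The sketch below uses `os.statvfs` from the standard library and works on Linux. It assumes the `readonly` setting is applied at the file system mount level inside the container, which this page does not confirm, so treat it as a starting point.

```python
import os


def is_read_only_mount(path: str) -> bool:
    """Return True if the file system holding path is mounted read-only."""
    flags = os.statvfs(path).f_flag
    return bool(flags & os.ST_RDONLY)
```

A script could call `is_read_only_mount("/mnt/training-data")` before processing and exit with an error if it returns `False`.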

***

#### Always Version Final Results

```python
# ❌ Bad: Only use EFS, nothing versioned
results = process_data("/mnt/efs-data/")
results.save("/mnt/efs-output/results.pkl")  # NOT versioned

# ✅ Good: EFS for input, outputs for results
results = process_data("/mnt/efs-data/")
results.save("/valohai/outputs/results.pkl")  # VERSIONED
```

***

#### Structure Your EFS Data

```
/
├── raw-data/
│   ├── images/
│   ├── videos/
│   └── text/
├── feature-cache/
│   └── features_v1.pkl
└── intermediate/
    └── temp_processing/
```

Organize data logically for easier mounting and access control.

***

#### Monitor EFS Usage

Check EFS metrics in AWS CloudWatch:

* **Burst credit balance** — Ensure you're not exhausting bursting capacity
* **Throughput utilization** — Monitor if hitting limits
* **IOPS utilization** — Check file operation patterns
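
These metrics can also be read programmatically. The following sketch builds a CloudWatch query for the `BurstCreditBalance` metric in the `AWS/EFS` namespace and fetches it with `boto3`. It assumes `boto3` is installed and AWS credentials with CloudWatch read access are available. The function names are illustrative.

```python
from datetime import datetime, timedelta, timezone


def burst_credit_query(file_system_id: str, hours: int = 24) -> dict:
    """Build get_metric_statistics arguments for the EFS BurstCreditBalance metric."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/EFS",
        "MetricName": "BurstCreditBalance",
        "Dimensions": [{"Name": "FileSystemId", "Value": file_system_id}],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 3600,  # one datapoint per hour
        "Statistics": ["Minimum"],
    }


def fetch_min_burst_credits(file_system_id: str, region: str) -> list:
    """Return (timestamp, minimum balance) pairs, oldest first."""
    import boto3  # imported lazily so the query builder works without boto3

    cloudwatch = boto3.client("cloudwatch", region_name=region)
    response = cloudwatch.get_metric_statistics(**burst_credit_query(file_system_id))
    points = sorted(response["Datapoints"], key=lambda p: p["Timestamp"])
    return [(p["Timestamp"], p["Minimum"]) for p in points]
```

A steadily falling minimum balance suggests that executions are consuming burst credits faster than the file system earns them.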

***

#### Handle Mount Errors

```python
import os
import sys

EFS_PATH = "/mnt/efs-data/"

# Verify mount is accessible
if not os.path.exists(EFS_PATH):
    print(f"ERROR: EFS mount {EFS_PATH} not accessible")
    print("Possible causes:")
    print("  - Security group not configured")
    print("  - VPC/network connectivity issue")
    print("  - Wrong EFS file system ID")
    sys.exit(1)

# Verify expected structure
expected_dir = os.path.join(EFS_PATH, "datasets")
if not os.path.exists(expected_dir):
    print(f"WARNING: Expected directory not found: {expected_dir}")
    print(f"Available directories: {os.listdir(EFS_PATH)}")

print(f"EFS mount verified: {EFS_PATH}")
```

***

### Maintaining Reproducibility

> ⚠️ **Critical:** EFS data can change between executions. Always save processed results to `/valohai/outputs/` for versioning.

**The problem:**

```python
# Today: Process data from EFS
data = load_from_efs("/mnt/efs-data/")
model = train(data)

# Next week: Someone updates EFS data
# Retraining gives different results
# Can't reproduce original model
```

**The solution:**

```python
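import json

# Load from EFS (current state)
data = load_from_efs("/mnt/efs-data/")

# Save snapshot to versioned outputs
data.to_csv("/valohai/outputs/training_snapshot.csv")

# Create dataset version
metadata = {
    "training_snapshot.csv": {
        "valohai.dataset-versions": [
            {
                "uri": "dataset://training-data/2024-01-15",
            },
        ],
    },
}

# Write the metadata file so the dataset version is registered
with open("/valohai/outputs/valohai.metadata.jsonl", "w") as f:
    for filename, file_meta in metadata.items():
        f.write(json.dumps({"file": filename, "metadata": file_meta}) + "\n")

# Train on versioned snapshot in next execution
# Can reproduce exactly anytime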
```

**See:** [Access Network Storage](/data/data-nfs.md) for complete patterns.

***

### Related Pages

* [Access Network Storage](/data/data-nfs.md) — Overview and when to use NFS
* [Google Cloud Filestore](/data/data-nfs/google-filestore.md) — GCP equivalent
* [On-Premises NFS](/data/data-nfs/onprem-nfs.md) — Mount on-prem storage
* [Load Data in Jobs](/data/data-versioning/load-files-in-jobs.md) — Alternative: Valohai's versioned inputs

***

### Next Steps

* Set up EFS in your AWS account (or use an existing file system)
* Configure security groups for Valohai access
* Create a test execution that mounts EFS
* Build a [pipeline](/pipelines.md): mount → process → [save to outputs](/data/data-versioning/save-files-from-jobs.md)
* Monitor EFS performance metrics

