On-Premises NFS
Mount on-premises network file systems to access existing data infrastructure directly from Valohai executions.
When to Use On-Premises NFS
On-premises NFS mounting serves a different purpose than cloud network storage:
Data Already Exists on Network Shares
Use when:
Large datasets already on corporate NFS servers
Legacy systems produce data on network shares
Multiple departments share data on existing file servers
Migrating terabytes of data is impractical
Example workflow:
Medical imaging data sits on a hospital NFS server
Mount the volume to your execution
Process the data while meeting compliance requirements
Save results to outputs to start tracking them as datums
Everything is versioned and tracked for audit
Data Compliance Requirements
Use when:
Healthcare data must stay in the hospital network (HIPAA)
Financial data has regulatory restrictions (PCI DSS, GDPR)
Government data cannot leave a controlled environment
Corporate policy prohibits cloud data storage
Hybrid Cloud Strategy
Use when:
Transitioning gradually to cloud
Need access to both on-prem and cloud data
Want to keep sensitive data on-prem while using cloud compute
Cost optimization (avoid cloud storage costs for large static datasets)
Critical Trade-Off: Speed vs. Versioning
⚠️ Important: Valohai does NOT version or track files on mounted network storage.
What this means:
Files read from mounts: Not versioned
Files written to mounts: Not versioned
Files saved to /valohai/outputs/: Versioned ✅
Decision Tree: Should I Use NFS Mounts?
On-Prem NFS vs. Valohai Inputs
| Factor | On-Prem NFS | Valohai Inputs |
| --- | --- | --- |
| Versioning | ❌ No tracking | ✅ Full versioning |
| Reproducibility | ❌ Data can change | ✅ Immutable references |
| Data location | ✅ Stays on-premises | ❌ Must be in cloud storage |
| Setup complexity | ⚠️ Network + VPN config | ✅ Simple |
| Speed | ⚠️ Depends on network | ✅ Fast (cloud-native) |
| Best for | Existing on-prem data, compliance | All other cases |
| Compliance | ✅ Data never leaves premises | ❌ Data moves to cloud |
Recommended Pattern: Mount → Process → Save
Always save processed results to /valohai/outputs/ for versioning:
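A minimal Python sketch of the pattern, not Valohai's own example. It assumes the share is mounted at a path you pass in (for instance the `/mnt/company-data` destination mentioned elsewhere on this page) and that results belong under `/valohai/outputs/`. The function name `summarize_mount` and the output file name `summary.json` are hypothetical, and the "processing" step is a placeholder that only records file sizes.

```python
import json
from pathlib import Path


def summarize_mount(mount_dir, output_dir="/valohai/outputs"):
    """Read files from a mounted share and write a summary to the outputs directory."""
    mount = Path(mount_dir)
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Process: placeholder step that records the name and size of each file.
    summary = [
        {"file": str(p.relative_to(mount)), "bytes": p.stat().st_size}
        for p in sorted(mount.rglob("*"))
        if p.is_file()
    ]

    # Save: only files written to the outputs directory are versioned.
    result_path = out / "summary.json"
    result_path.write_text(json.dumps(summary, indent=2))
    return result_path
```

Inside an execution this could be called as `summarize_mount("/mnt/company-data")`. Nothing is written back to the mount, so the share can stay read-only.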
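Why this matters: files on the mount are not versioned, so results become tracked and auditable only once they are saved to /valohai/outputs/.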
Prerequisites
Before mounting on-premises NFS in Valohai:
Network connectivity — Valohai execution environments must reach your on-prem NFS server
VPN or Direct Connect — Secure connection between cloud and on-premises network
NFS server accessible — NFS service running and accessible from Valohai worker IPs
Firewall rules — Allow NFS traffic from Valohai workers
Mount permissions — NFS export configured to allow access from Valohai workers
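The network-related prerequisites above can be spot-checked from inside a small execution before attempting a real mount. The sketch below is an assumption-laden helper, not part of Valohai: it only tests whether a TCP connection to the NFS port opens. Port 2049 is the standard NFS port; older NFSv3 setups may also need rpcbind and mountd ports, which this does not check, and a successful connection says nothing about export permissions. The function name `nfs_port_reachable` is hypothetical.

```python
import socket


def nfs_port_reachable(host, port=2049, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

A `False` result usually points at the VPN, routing, or firewall rules rather than at the NFS export itself.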
Mount On-Premises NFS in Execution
Basic Mount Configuration
valohai.yaml:
For networked NFS server:
Parameters:
destination - Mount point inside container (e.g., /mnt/company-data)
source - NFS path (format: <server>:<export-path> or local mount path)
type - nfs when specifying remote server
readonly - true (recommended) or false
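To make the `source` format concrete, here is a small Python sketch that splits a `<server>:<export-path>` string and passes a local mount path through unchanged. It is an illustration only: `parse_nfs_source` is a hypothetical helper, not a Valohai function, and it assumes export paths are absolute and that the server is a hostname or IPv4 address (IPv6 literals are not handled).

```python
def parse_nfs_source(source):
    """Split '<server>:<export-path>' into (server, export_path).

    A source that is already a local mount path returns (None, source).
    """
    if source.startswith("/"):
        return None, source  # local mount path, no remote server part

    server, sep, export_path = source.partition(":")
    if not sep or not server or not export_path.startswith("/"):
        raise ValueError(f"expected '<server>:<export-path>', got {source!r}")
    return server, export_path
```

For example, a made-up source such as `10.0.0.5:/exports/data` yields the server `10.0.0.5` and the export path `/exports/data`, which is the case where `type` would be set to `nfs`.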
Mount Specific NFS Directory
Mounts only a specific subdirectory from your NFS server.
Complete Workflow Example
Mount → Process → Save Pattern
Scenario: Process medical imaging from hospital NFS, extract features, save to Valohai outputs for compliance tracking.
valohai.yaml:
process_scans.py:
Result:
✅ Medical scans accessed from on-prem NFS (data never leaves hospital network)
✅ De-identified metadata and features saved to /valohai/outputs/ (versioned, compliant)
✅ Dataset created for reproducible analysis
✅ Audit trail maintained with source tracking
Readonly vs. Writeable Mounts
Readonly Mounts (Recommended)
Use when:
Accessing shared reference data
Reading large datasets for processing
Multiple executions need the same data
Want to prevent accidental modifications
Benefits:
✅ Prevents accidental data corruption
✅ Safe for parallel executions
✅ Clear intent (read-only access)
Writeable Mounts (Use Carefully)
Use when:
Need shared scratch space for intermediate results
Writing temporary files shared across parallel workers
Caching expensive computations
Risks:
⚠️ Files written here are NOT versioned
⚠️ Parallel executions can conflict
⚠️ No automatic cleanup
Best practice: Use writeable mounts for temporary data only. Always save final results to /valohai/outputs/.
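When parallel executions share a writeable scratch directory, one common way to reduce conflicts is to write each file under a temporary name and rename it into place, so readers never see a half-written file. The sketch below shows that general technique in Python; it is not a Valohai feature, `write_atomically` is a hypothetical helper, and NFS client caching can still delay when other workers see the new file.

```python
import os
import tempfile


def write_atomically(path, data):
    """Write bytes to path via a temp file in the same directory, then rename."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        # Rename within one directory: readers see the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This protects individual files only. Two workers writing the same path still overwrite each other, so unique file names per execution remain the safer choice.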
Best Practices
Use Readonly for Sensitive Data
Always Version Processed Results
Maintaining Reproducibility
⚠️ Critical: On-premises data can change. Always save processed results to /valohai/outputs/ for versioning and audit trails.
The problem:
The solution:
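One possible way to support this, sketched in Python under stated assumptions: alongside the processed results, save a small manifest that records a checksum for every source file read from the mount. If the on-premises data changes later, the manifest still documents exactly which bytes a given execution saw. The helper names and the `source_manifest.json` file name are hypothetical; only the `/valohai/outputs/` location comes from this page.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_source_manifest(mount_dir, files, output_dir="/valohai/outputs"):
    """Record which mounted files were read, with checksums, in the outputs directory."""
    mount = Path(mount_dir)
    entries = [
        {"path": str(Path(f).relative_to(mount)), "sha256": sha256_of(f)}
        for f in files
    ]
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = out / "source_manifest.json"
    manifest.write_text(json.dumps({"mount": str(mount), "files": entries}, indent=2))
    return manifest
```

Because the manifest is written to the outputs directory, it is versioned together with the processed results, while the source files themselves stay on the mount.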
Related Pages
AWS Elastic File System — Cloud NFS for AWS
Google Cloud Filestore — Cloud NFS for GCP
Load Data in Jobs — Alternative: Valohai's versioned inputs
Databases — Access on-prem databases
Next Steps
Set up VPN or Direct Connect between cloud and on-premises
Configure NFS exports and firewall rules
Test connectivity with a small execution
Build pipeline: mount → process → save to outputs
Document compliance and data handling procedures
Monitor network performance and optimize access patterns
