Files managed by Valohai are fully version-controlled.
When an execution uploads a file, such as a processed dataset or a model, it never overwrites a previous file. Uploaded files are stored indefinitely unless you remove them manually, for example with our “purge” actions.
Versioning is automatic when you use one of the Data Stores we support: AWS S3, Azure Blob Storage, Google Cloud Storage, or OpenStack Swift.
Execution Tracking and Metadata
Valohai automatically tracks essential information about your executions, simplifying experiment reproduction and comprehension.
Execution details
Access comprehensive details about all past executions:
- Resource usage: how much CPU, memory, and GPU your job uses
- Additional views to logs, metadata and generated outputs
- Commit information (for Git-based executions), hardware configuration, Docker images, executed commands
- Input data files
- Parameters
- Creator, duration of the job and costs
Metadata
Collect custom metadata for executions, such as performance metrics, to better understand each run (number 1 in the picture). View this data on the Metadata tab of each execution. A metadata table is also shown below the visualization (number 3 in the picture).
You can also download metadata in CSV or JSON format directly from the execution's Metadata tab (number 2 in the picture).
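As a minimal sketch of collecting custom metadata, the Python below prints one JSON object per line to standard output. It assumes that JSON lines printed by an execution are picked up as metadata; check the metadata documentation to confirm. The helper name `format_metadata` and the metric names `epoch`, `loss`, and `accuracy` are hypothetical, and the values are placeholders.

```python
import json


def format_metadata(**metrics):
    """Serialize metrics as a single-line JSON object (hypothetical helper)."""
    return json.dumps(metrics)


# Placeholder values standing in for a real training loop.
for epoch, (loss, accuracy) in enumerate([(0.9, 0.61), (0.5, 0.78)], start=1):
    # Assumption: each JSON line on stdout is collected as execution metadata.
    print(format_metadata(epoch=epoch, loss=loss, accuracy=accuracy))
```

Keeping each record on a single line means every printed line can be parsed on its own.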
Trace files
Easily trace files connected to Valohai (inputs/outputs) to understand their generation and dependencies. Visualize how files were created and identify which executions and deployments rely on them in the project’s data tab.
Tags
Use tags to categorize and identify specific executions. Tags are helpful for highlighting important runs or aiding team members in locating particular executions, reducing the need to sift through numerous experiments. Set tags on the Details tab of each execution.
Aliases
Aliases in Valohai are pointers to specific versions of files, providing convenient ways to reference essential resources. For instance, you can have aliases like latest-clean-dataset or production-model.
Valohai automatically versions and tracks each update to an alias, maintaining a history of changes.
You can create aliases programmatically, directly within the web app, or through the REST API.
- Alias name
- Link to the history log
- History and change log of the alias
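One way to create an alias programmatically can be sketched as follows: the execution writes an output file together with a JSON sidecar file that requests the alias. This sketch rests on assumptions that should be verified against the current documentation: that outputs are collected from the directory named by the `VH_OUTPUTS_DIR` environment variable, that the sidecar is named `<filename>.metadata.json`, and that the `valohai.alias` key sets the alias. The function name `save_with_alias` and the file name `model.pkl` are hypothetical.

```python
import json
import os
import tempfile


def save_with_alias(filename, content, alias, outputs_dir=None):
    """Write an output file plus a JSON sidecar that requests an alias.

    Assumptions (verify against the documentation): outputs are collected
    from the directory in VH_OUTPUTS_DIR, and a sidecar named
    <filename>.metadata.json with a "valohai.alias" key sets the alias.
    """
    # Fall back to a temporary directory when running outside an execution.
    outputs_dir = outputs_dir or os.environ.get("VH_OUTPUTS_DIR") or tempfile.mkdtemp()
    os.makedirs(outputs_dir, exist_ok=True)

    file_path = os.path.join(outputs_dir, filename)
    with open(file_path, "wb") as f:
        f.write(content)

    sidecar_path = file_path + ".metadata.json"
    with open(sidecar_path, "w") as f:
        json.dump({"valohai.alias": alias}, f)

    return file_path, sidecar_path


# Placeholder bytes stand in for a real serialized model.
paths = save_with_alias("model.pkl", b"placeholder model bytes", "production-model")
```

Because every uploaded file is kept as its own version, pointing `production-model` at a new file does not remove the file it pointed to before.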
Datasets
Datasets in Valohai are references to specific versions of file collections, simplifying data management. For instance, you can create datasets named cats or factory-g4-data.
You can easily add or remove files from a dataset. Each modification generates a new dataset version, which Valohai automatically tracks.
- Name of the dataset
- Dataset aliases
- Dataset version history
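The versioning behavior described above, where each modification generates a new dataset version, can be illustrated with a toy in-memory model. This is only a conceptual sketch and does not use the Valohai API or reflect how Valohai stores datasets; the class names `Dataset` and `DatasetVersion` and the file names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DatasetVersion:
    """An immutable snapshot of the files in a dataset."""
    number: int
    files: frozenset


@dataclass
class Dataset:
    """Toy model: every add or remove appends a new version."""
    name: str
    versions: list = field(default_factory=list)

    @property
    def latest_files(self):
        return self.versions[-1].files if self.versions else frozenset()

    def _commit(self, files):
        version = DatasetVersion(len(self.versions) + 1, frozenset(files))
        self.versions.append(version)
        return version

    def add(self, *files):
        return self._commit(self.latest_files | set(files))

    def remove(self, *files):
        return self._commit(self.latest_files - set(files))


cats = Dataset("cats")
cats.add("tabby.jpg", "siamese.jpg")   # creates version 1
cats.remove("tabby.jpg")               # creates version 2, version 1 is unchanged
print([(v.number, sorted(v.files)) for v in cats.versions])
```

Earlier versions are never mutated, so anything that referenced version 1 still sees the same set of files after later changes.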