Executions Reuse is a Valohai pipeline feature that reuses intermediate results from previous executions. It is a form of pipeline caching: when a step’s input data, parameters, and source code have not changed, Valohai reuses the results of the identical earlier step instead of recomputing them, which eliminates redundant computation and speeds up pipeline runs.
Understanding Executions Reuse
Executions Reuse recycles the results of a pipeline step when its input data, parameters, and source code match those of a previous execution, whether or not that execution was part of a pipeline. This works because Valohai’s core system tracks all of these components to ensure reproducibility and consistency across executions.
By default, Valohai tracks each execution’s input data, parameters, source code, and outputs. When reuse-executions is set to true in the pipeline configuration, Valohai checks each step as it is initiated to see whether its input data, parameters, and source code match those of a previous execution. If a match is found, the results of that execution are reused instead of running the step again, saving time and resources.
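The matching idea can be illustrated with a minimal Python sketch. It is not Valohai's implementation: the names reuse_key, ReuseCache, and run_step are hypothetical, and hashing the commit, step name, parameters, and input URIs into one fingerprint is an assumption about how such a check could work. Including the step name in the fingerprint is also an assumption, added so that two different steps with identical settings do not collide.

```python
import hashlib
import json
from typing import Callable, Dict, Optional, Tuple


def reuse_key(commit: str, step: str, parameters: dict, inputs: dict) -> str:
    """Hypothetical fingerprint of everything that must match for reuse."""
    payload = json.dumps(
        {"commit": commit, "step": step, "parameters": parameters, "inputs": inputs},
        sort_keys=True,  # stable key order, so equal settings give equal hashes
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ReuseCache:
    """In-memory stand-in for a store of previous execution outputs."""

    def __init__(self) -> None:
        self._outputs: Dict[str, dict] = {}

    def lookup(self, key: str) -> Optional[dict]:
        return self._outputs.get(key)

    def store(self, key: str, outputs: dict) -> None:
        self._outputs[key] = outputs


def run_step(
    cache: ReuseCache,
    commit: str,
    step: str,
    parameters: dict,
    inputs: dict,
    compute: Callable[[], dict],
) -> Tuple[dict, bool]:
    """Return (outputs, reused). Runs compute() only when no match exists."""
    key = reuse_key(commit, step, parameters, inputs)
    cached = cache.lookup(key)
    if cached is not None:
        return cached, True  # identical earlier execution found: reuse it
    outputs = compute()
    cache.store(key, outputs)
    return outputs, False
```

In this sketch, calling run_step twice with the same commit, parameters, and inputs runs compute() only once; changing any of the three produces a different fingerprint and a fresh run.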
Configuring Executions Reuse
There are two ways to enable Executions Reuse in Valohai pipelines: in the YAML configuration, which applies to every run of the pipeline, or in the UI settings, where it can be enabled or disabled for specific pipeline runs.
YAML Configuration
To activate Executions Reuse, update the valohai.yaml configuration file by adding the reuse-executions: true flag to the pipeline definition. Below is an example configuration:
- pipeline:
    name: acme-pipeline
    reuse-executions: true
    nodes:
      - name: preprocess
        type: execution
        step: preprocess-dataset
      - name: train1
        type: execution
        step: train-model
        override:
          ...
UI Settings
Executions Reuse can also be enabled or disabled via the Valohai platform’s user interface during the setup or editing of a pipeline.