The Productivity Dashboard provides a comprehensive overview of your organization's performance, giving you key insights into efficiency, resource utilization, and operational health. This dashboard helps you track progress, identify bottlenecks, and make data-driven decisions to optimize your machine learning workflows. The data is displayed for the last 30 days by default, but you can use the date range selector to view data for different periods.
Date Range Selection
Use the date range selector in the top-right corner of the dashboard to view data for a custom period.
Top-Level Metrics
The top row of the dashboard displays four critical metrics:
Time to Value (TTV)
This metric shows the average number of days it takes for a project to go from initial kickoff to the first completed job and an approved model version. A lower TTV indicates faster development cycles. The trend indicator shows how the TTV has changed during the selected period. Learn more about using models in Valohai.
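As a rough illustration of the averaging described above, the following sketch computes a TTV figure from hand-made project records. It is not Valohai's implementation: the record fields (`kickoff`, `first_completed_job`, `first_approved_model`) are hypothetical, and it assumes a project reaches "value" on the later of its two milestones and that projects missing either milestone are left out of the average.

```python
from datetime import date
from statistics import mean


def average_ttv_days(projects):
    """Average days from kickoff until both milestones are reached.

    Assumption: the end point is the later of the first completed job and
    the first approved model version. Projects missing either milestone
    are skipped.
    """
    durations = []
    for project in projects:
        job_done = project["first_completed_job"]
        model_ok = project["first_approved_model"]
        if job_done is None or model_ok is None:
            continue  # project has not delivered value yet
        reached = max(job_done, model_ok)
        durations.append((reached - project["kickoff"]).days)
    return mean(durations) if durations else None


# Hypothetical sample data, for illustration only.
sample_projects = [
    {"kickoff": date(2024, 1, 1),
     "first_completed_job": date(2024, 1, 5),
     "first_approved_model": date(2024, 1, 11)},
    {"kickoff": date(2024, 2, 1),
     "first_completed_job": date(2024, 2, 3),
     "first_approved_model": date(2024, 2, 21)},
    {"kickoff": date(2024, 3, 1),
     "first_completed_job": date(2024, 3, 2),
     "first_approved_model": None},
]

ttv = average_ttv_days(sample_projects)
```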
Job Reuse Savings
This section shows the financial and time benefits of reusing matching configuration nodes from past executions. Trend indicators show how both cost and time savings have changed during the selected period. Learn more about reusing nodes in Valohai.
- Cost Savings: The total amount of money saved by reusing jobs instead of running new ones.
- Time Savings: The total amount of time saved by reusing jobs.
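One plausible way to think about these two totals is sketched below. It assumes, purely for illustration, that the saving from each reused job equals the cost and duration of the original execution it replaced. The field names `original_cost` and `original_duration_s` are hypothetical and do not come from the Valohai API.

```python
def reuse_savings(reused_jobs):
    """Return (total cost saved, total seconds saved) for reused jobs.

    Assumption: reusing a job saves exactly what the original run cost
    and how long it took.
    """
    total_cost = sum(job["original_cost"] for job in reused_jobs)
    total_seconds = sum(job["original_duration_s"] for job in reused_jobs)
    return total_cost, total_seconds


# Hypothetical sample data, for illustration only.
sample_reused = [
    {"original_cost": 2.5, "original_duration_s": 1800},
    {"original_cost": 4.0, "original_duration_s": 3600},
]

cost_saved, seconds_saved = reuse_savings(sample_reused)
hours_saved = seconds_saved / 3600
```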
Automated Pipeline Success Rate
This metric shows the percentage of automated pipelines that completed successfully. Automated pipelines are those triggered by a scheduled task or a webhook. A higher success rate indicates greater operational reliability. The trend indicator shows how the success rate has changed during the selected period.
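The calculation can be sketched as follows. This is a minimal example under stated assumptions, not the dashboard's actual logic: the `trigger` and `status` fields and their values are hypothetical, and manually started pipelines are simply excluded from the denominator.

```python
# Hypothetical trigger labels for pipelines that count as automated.
AUTOMATED_TRIGGERS = {"schedule", "webhook"}


def automated_success_rate(pipelines):
    """Percentage of automated pipelines whose status is 'completed'."""
    automated = [p for p in pipelines if p["trigger"] in AUTOMATED_TRIGGERS]
    if not automated:
        return None  # no automated pipelines in the period
    completed = sum(1 for p in automated if p["status"] == "completed")
    return 100.0 * completed / len(automated)


# Hypothetical sample data, for illustration only.
sample_pipelines = [
    {"trigger": "schedule", "status": "completed"},
    {"trigger": "schedule", "status": "error"},
    {"trigger": "webhook", "status": "completed"},
    {"trigger": "webhook", "status": "completed"},
    {"trigger": "manual", "status": "error"},  # ignored: not automated
]

rate = automated_success_rate(sample_pipelines)
```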
Detailed Statistics
Below the key metrics, you’ll find more detailed statistics grouped into several sections:
Project Cost Breakdown
This chart shows the total compute cost of the five most expensive projects, listed in descending order of cost. It helps you understand where your resources are being allocated and identify potential areas for cost optimization.
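A ranking of this kind can be reproduced from per-job cost records, as in the sketch below. The `project` and `cost` fields are hypothetical, and the example only illustrates the "sum per project, sort descending, keep five" idea rather than how the dashboard computes it.

```python
from collections import defaultdict


def top_projects_by_cost(jobs, n=5):
    """Sum job costs per project and return the n most expensive."""
    totals = defaultdict(float)
    for job in jobs:
        totals[job["project"]] += job["cost"]
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical sample data, for illustration only.
sample_jobs = [
    {"project": "vision", "cost": 12.0},
    {"project": "nlp", "cost": 30.0},
    {"project": "vision", "cost": 8.0},
    {"project": "tabular", "cost": 5.0},
]

top = top_projects_by_cost(sample_jobs)
```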
Avg. GPU Utilization
This section provides insights into GPU utilization.
- Gauge: Shows the average GPU utilization across all jobs. Higher utilization generally indicates better resource efficiency.
- Counter: Displays the number of GPU utilization alerts triggered during the selected period. A high number of alerts might indicate underutilized resources or potential bottlenecks.
Total Run Jobs
This chart displays the total number of jobs run for the five largest projects, providing a clear view of workload distribution.
Job Reuse Rate
This chart shows the percentage of jobs that were reused for the five largest projects. A higher reuse rate indicates greater efficiency, as it reduces the need to run new jobs for repeated tasks.
Peak Waiting Time (per environment)
This chart shows the maximum time a job spent waiting in the queue before starting, for each of the five slowest environments. This helps identify potential bottlenecks in specific environments. Shorter wait times indicate more responsive infrastructure.
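The per-environment peak can be illustrated with a short sketch. It assumes each job record carries hypothetical `queued_at` and `started_at` timestamps in seconds plus an `environment` name, and that "slowest" means the environments with the largest peak wait.

```python
def peak_wait_by_environment(jobs, n=5):
    """Return the n environments with the longest single queue wait."""
    peaks = {}
    for job in jobs:
        wait = job["started_at"] - job["queued_at"]  # seconds in queue
        env = job["environment"]
        peaks[env] = max(peaks.get(env, 0), wait)
    ranked = sorted(peaks.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n]


# Hypothetical sample data, for illustration only.
sample_queue = [
    {"environment": "gpu-large", "queued_at": 0, "started_at": 600},
    {"environment": "gpu-large", "queued_at": 100, "started_at": 160},
    {"environment": "cpu-small", "queued_at": 0, "started_at": 30},
    {"environment": "cpu-small", "queued_at": 50, "started_at": 170},
]

slowest = peak_wait_by_environment(sample_queue)
```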
Pipeline Status Overview
This chart provides a breakdown of the status of all pipelines run during the selected period. It shows the percentage of pipelines that were:
- Completed: Successfully finished.
- Stopped: Manually stopped by a user.
- Error: Encountered an error and failed to complete.
The total number of pipelines run for each status is also displayed next to its percentage.
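The count-and-percentage pairing described above can be sketched like this. The status strings are hypothetical labels chosen to mirror the three categories in the chart, not values taken from the Valohai API.

```python
from collections import Counter


def status_breakdown(statuses):
    """Map each status to (count, percentage of all pipelines)."""
    counts = Counter(statuses)
    total = sum(counts.values())
    return {
        status: (count, round(100 * count / total, 1))
        for status, count in counts.items()
    }


# Hypothetical sample data, for illustration only.
sample_statuses = ["completed"] * 6 + ["stopped"] * 1 + ["error"] * 3

breakdown = status_breakdown(sample_statuses)
```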
Most Used Datasets
This chart displays the five most used datasets in your projects. This helps you identify the most valuable and critical datasets.
Data Provenance Tracking
This chart shows the percentage of datasets for which data lineage is being tracked. Tracked data consists of files that executions use as inputs or outputs, while untracked data consists of loose files that are not used in any execution.
- Tracked: Datasets with complete data lineage information.
- Untracked: Datasets without complete data lineage information.
Higher tracking coverage is crucial for data governance and reproducibility.
Model Reproducibility
This chart shows the percentage of models that are fully reproducible, meaning they can be recreated from their original code, data, and environment. Models that are imported and trained outside of Valohai are not considered reproducible. A high reproducibility rate is essential for ensuring the reliability and auditability of your models.