Using Valohai Ecosystem Libraries

Valohai maintains a collection of pre-built library steps for common workflows. These are production-ready and available to all organizations.

Available libraries

Database Connectors

Run SQL queries and save results to your data store.

All database connectors:

  • Accept SQL queries as parameters

  • Save results as CSV outputs

  • Support both credential-based and machine identity authentication

  • Automatically version query results in your data store
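
The shared pattern behind these connectors (a SQL query in, a CSV file out) can be illustrated with a minimal Python sketch. This is not the connectors' actual implementation: it uses the standard library's sqlite3 as a stand-in database, and the function names run_query_to_csv and demo, the event_log table, and the sample rows are all hypothetical.

```python
import csv
import sqlite3


def run_query_to_csv(conn, query, output_path):
    """Run a SQL query and write the result set, with a header row, to a CSV file."""
    cursor = conn.execute(query)
    # cursor.description has one tuple per result column; item 0 is the column name.
    header = [column[0] for column in cursor.description]
    rows = cursor.fetchall()
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)
    return len(rows)


def demo(output_path="top_users.csv"):
    # Assumption: an in-memory SQLite database stands in for a real warehouse.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event_log (user_id TEXT, date TEXT)")
    conn.executemany(
        "INSERT INTO event_log VALUES (?, ?)",
        [("a", "2025-01-02"), ("a", "2025-01-03"), ("b", "2025-01-02")],
    )
    query = (
        "SELECT user_id, COUNT(*) AS events "
        "FROM event_log GROUP BY user_id ORDER BY events DESC"
    )
    row_count = run_query_to_csv(conn, query, output_path)
    conn.close()
    return row_count
```

Calling demo() writes a small CSV with a user_id and an events column. A real connector would additionally handle authentication and upload the file to your data store.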

Docker Image Builder

Build and push Docker images without installing Docker locally. The builder supports AWS ECR, GCP Artifact Registry, and Docker Hub.

How to use ecosystem libraries

Ecosystem libraries are automatically available in your organization. No setup required.

Run a library step

  1. Open your project

  2. Click Create Execution under the Executions tab

  3. Expand the step library by clicking the + next to valohai-ecosystem in the left panel

  4. Select a library step (e.g., bigquery-query)

  5. Configure parameters and environment variables

  6. Click Create Execution

Library steps run like any other execution—same logs, same outputs, same metadata tracking.

Example: Query BigQuery

This example runs a BigQuery query and saves the results to your data store.

Add environment variables

Set the following variables under your project Settings or in an organization-wide environment variable group:

  • GCP_PROJECT — Your GCP project ID

  • GCP_IAM — Set to 1 to use machine identity, or 0 for keyfile auth

  • GCP_KEYFILE_CONTENTS_JSON — (If using keyfile) Service account JSON
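
The relationship between these three variables can be sketched as a small pre-flight check in Python. This is an illustrative assumption about how the values fit together, not the library step's own code; the function name resolve_gcp_auth is hypothetical, and the sketch requires GCP_IAM to be set explicitly because the default is not stated here.

```python
import json
import os


def resolve_gcp_auth(env=None):
    """Return ("machine-identity", None) or ("keyfile", parsed_json)."""
    env = os.environ if env is None else env

    if not env.get("GCP_PROJECT"):
        raise ValueError("GCP_PROJECT must be set to your GCP project ID")

    iam = env.get("GCP_IAM")
    if iam == "1":
        # Machine identity: no keyfile is needed.
        return "machine-identity", None
    if iam != "0":
        # Assumption of this sketch: only "1" and "0" are accepted.
        raise ValueError("GCP_IAM must be 1 (machine identity) or 0 (keyfile)")

    raw = env.get("GCP_KEYFILE_CONTENTS_JSON")
    if not raw:
        raise ValueError("GCP_KEYFILE_CONTENTS_JSON is required when GCP_IAM is 0")
    try:
        keyfile = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("GCP_KEYFILE_CONTENTS_JSON is not valid JSON") from exc
    return "keyfile", keyfile
```

Running a check like this locally against the values you plan to store can catch a missing or malformed keyfile before you create the execution.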

Create the execution

  1. Select the bigquery-query step from valohai-ecosystem

  2. Write your SQL query in the query parameter:

SELECT user_id, COUNT(*) as events
FROM `my-project.analytics.events`
WHERE date >= '2025-01-01'
GROUP BY user_id
ORDER BY events DESC
LIMIT 100

  3. (Optional) Set an output path like top_users.csv

  4. (Optional) Add a datum alias like latest-user-stats for easy reference

  5. Click Create Execution

The query runs on BigQuery, and results are saved to your data store. Use the output in other executions with datum://latest-user-stats.
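
As a rough sketch of the consuming side, a downstream execution could read the CSV with plain Python once an input points at datum://latest-user-stats. The input name user_stats and the function load_query_results are hypothetical. The sketch also assumes that inputs are downloaded to a directory named after the input under VH_INPUTS_DIR (falling back to /valohai/inputs) and that the CSV has a header row.

```python
import csv
import os
from pathlib import Path


def load_query_results(input_name="user_stats", inputs_dir=None):
    """Load the first CSV file found in an input directory as a list of dicts."""
    # Assumption: files for an input live in <inputs_dir>/<input_name>/.
    base = Path(inputs_dir or os.environ.get("VH_INPUTS_DIR", "/valohai/inputs"))
    csv_files = sorted((base / input_name).glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No CSV file found in {base / input_name}")
    with open(csv_files[0], newline="") as f:
        # Assumption: the first row holds the column names.
        return list(csv.DictReader(f))
```

For the query above, each returned row would be a dict with user_id and events keys. All values arrive as strings, so convert events with int() before doing arithmetic on it.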

Why use ecosystem libraries?

No setup needed: No YAML to write, no Git repository to manage. Just run.

Battle-tested: These steps are maintained by Valohai and used across hundreds of organizations.

Versioned results: Query outputs are automatically tracked and versioned in your data store.

Consistent patterns: All connectors work the same way—write a query, get a CSV. Easy to learn once and reuse everywhere.

Next steps

Database connectors:

Build custom images:

Create your own:
