Fetch Failed Executions

Query your failed executions programmatically to build automated retry workflows, monitoring dashboards, or failure analysis tools.

This example shows how to filter executions by status and extract failure details, perfect for building resilient ML pipelines that automatically recover from transient errors.

Prerequisites

  • A Valohai project with some executions (successful or failed)

  • Python 3.8+ and the requests library installed

Generate a Valohai API Token

Create an API token to authenticate your requests:

  1. Click "Hi, <username>!" in the top-right corner

  2. Go to My Profile → Authentication

  3. Click Manage Tokens and scroll to the bottom

  4. Click Generate New Token

  5. Copy the token immediately — it's shown only once

Store your token as an environment variable:

# Linux/Mac
export VH_API_TOKEN=your_token_value

# Windows
set VH_API_TOKEN=your_token_value

Get Your Project ID

Find your project ID in Project Settings → General

Fetch Failed Executions

Create fetch_failed_executions.py:

import requests
import json
import os

# Authenticate with your API token
auth_token = os.environ['VH_API_TOKEN']
headers = {'Authorization': f'Token {auth_token}'}

# Replace with your project ID
project_id = 'your-project-id-here'

# Fetch only failed executions
url = f'https://app.valohai.com/api/v0/executions/?project={project_id}&status=error'
resp = requests.get(url, headers=headers)
resp.raise_for_status()

# Print the response
print('Failed Executions:\n')
print(json.dumps(resp.json(), indent=4))

Run the script:

python fetch_failed_executions.py

Understanding the Response

The API returns a paginated list of executions. Each failed execution includes:

{
    "count": 3,
    "next": null,
    "previous": null,
    "results": [
        {
            "id": "exec-id-here",
            "counter": 42,
            "status": "error",
            "step": "train-model",
            "project": "project-id-here",
            "commit": "abc123...",
            "ctime": "2025-10-29T10:30:00Z",
            "duration": 125,
            "error_text": "Connection timeout during data download",
            "urls": {
                "display": "https://app.valohai.com/p/owner/project/execution/exec-id/"
            }
        }
    ]
}

Key fields:

  • status: Always "error" when filtered by status=error

  • error_text: Human-readable failure reason

  • step: Which step failed

  • duration: How long it ran before failing (in seconds)

  • urls.display: Direct link to the execution in Valohai

Status Filters

Query executions by different statuses:

# All failed executions
status=error

# All successful executions
status=completed

# Manually stopped executions
status=stopped

# Currently running executions
status=started

Next Steps

Now that you can fetch failed executions, you can:

Troubleshooting

Empty results:

  • Verify your project ID is correct

  • Check that you have executions with status=error

  • Try removing the status filter to see all executions

Authentication errors:

  • Confirm VH_API_TOKEN environment variable is set

  • Verify your token is valid in My Profile → Authentication

Self-hosted Valohai: Replace https://app.valohai.com with your installation URL

Last updated

Was this helpful?