Confusion Matrix
Confusion matrices show how your classifier performs across all classes. See where your model excels, where it struggles, and which classes get confused with each other.
Quick Start
1. Log Confusion Matrix Data
Print your confusion matrix as JSON in a specific format:
from sklearn.metrics import confusion_matrix
import json
# Get predictions
y_pred = model.predict(X_test)
# Compute confusion matrix
matrix = confusion_matrix(y_true, y_pred)
# Log to Valohai
print(json.dumps({
    "data": matrix.tolist()
}))
Required format: {"data": [[row1], [row2], ...]}
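The matrix itself doesn't have to come from scikit-learn; any nested list of counts printed in this shape should work. A minimal hand-built sketch:
import json

# A hand-built 2x2 matrix: rows are true classes, columns are predictions
print(json.dumps({"data": [[50, 2], [3, 45]]}))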
Add labels
If you add a list of strings as the first item in the data, those strings are used as labels. Make sure the number of labels matches the number of rows (and items per row).
print(json.dumps({
    "data": [["y_true", "y_pred"], [50, 2], [3, 45]]
}))
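If your class names live on a fitted scikit-learn estimator, you can derive the label row from its classes_ attribute so the ordering always matches the matrix. A sketch, where clf stands in for your fitted classifier:
import json
from sklearn.metrics import confusion_matrix

# Assumption: clf is a fitted scikit-learn classifier
labels = [str(c) for c in clf.classes_]  # class names in matrix order
matrix = confusion_matrix(y_true, y_pred, labels=clf.classes_)
print(json.dumps({"data": [labels] + matrix.tolist()}))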
2. View in Metadata Tab
Open your execution
Click the Metadata tab
Click the visualization dropdown (shows "Time Series" by default)
Select Confusion Matrix
The confusion matrix visualization appears automatically.

Complete Example
Binary Classification
from sklearn.metrics import confusion_matrix
import numpy as np
import json
# True labels and predictions
y_true = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 0]
# Compute confusion matrix
matrix = confusion_matrix(y_true, y_pred)
# Log to Valohai
print(json.dumps({
    "data": matrix.tolist()
}))
# Output: {"data": [[4, 1], [1, 4]]}
Interpretation:
Top-left (4): True negatives
Top-right (1): False positives
Bottom-left (1): False negatives
Bottom-right (4): True positives
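If you also want the four counts as individual metadata values, the binary matrix unpacks in row-major order. A small sketch reusing y_true and y_pred from above (int() keeps the NumPy integers JSON-serializable):
# sklearn's binary matrix is [[TN, FP], [FN, TP]], so ravel() yields them in order
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print(json.dumps({"tn": int(tn), "fp": int(fp), "fn": int(fn), "tp": int(tp)}))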
Multi-Class Classification
from sklearn.metrics import confusion_matrix
import json
# Example: 3-class problem
y_true = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]
# Compute confusion matrix
matrix = confusion_matrix(y_true, y_pred)
# Convert to list and log
result = matrix.tolist()
print(json.dumps({
    "data": result
}))
# Output: {"data": [[3, 0, 0], [0, 1, 2], [2, 1, 3]]}
Interpretation:
Rows represent true labels
Columns represent predicted labels
Diagonal shows correct predictions
Off-diagonal shows misclassifications
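To make the rows and columns readable in the UI, you can prepend the label row described under Add labels. A sketch reusing matrix from above; the class names are placeholders for your own:
# Placeholder names for classes 0, 1, 2 -- replace with your real class names
class_names = ["cat", "dog", "bird"]
print(json.dumps({"data": [class_names] + matrix.tolist()}))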
With PyTorch
import torch
from sklearn.metrics import confusion_matrix
import json
# After training/validation
model.eval()
all_preds = []
all_labels = []
with torch.no_grad():
    for data, labels in test_loader:
        outputs = model(data)
        _, predicted = torch.max(outputs, 1)
        all_preds.extend(predicted.cpu().numpy())
        all_labels.extend(labels.cpu().numpy())
# Compute confusion matrix
matrix = confusion_matrix(all_labels, all_preds)
# Log to Valohai
print(json.dumps({
    "data": matrix.tolist()
}))
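If your dataset exposes class names, you can reuse them as the label row. A sketch reusing matrix from above; the classes attribute is an assumption that holds for torchvision ImageFolder-style datasets:
# Assumption: the dataset carries a `classes` attribute
# (true for torchvision ImageFolder-style datasets)
class_names = list(test_loader.dataset.classes)
print(json.dumps({"data": [class_names] + matrix.tolist()}))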
With TensorFlow/Keras
import tensorflow as tf
from sklearn.metrics import confusion_matrix
import json
# Get predictions
y_pred = model.predict(X_test)
y_pred_classes = tf.argmax(y_pred, axis=1).numpy()
y_true_classes = tf.argmax(y_test, axis=1).numpy()
# Compute confusion matrix
matrix = confusion_matrix(y_true_classes, y_pred_classes)
# Log to Valohai
print(json.dumps({
    "data": matrix.tolist()
}))
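The argmax on y_test assumes one-hot targets. If your test labels are already integer class ids, only the predictions need the argmax; a small variation on the snippet above, reusing the same imports:
# Assumption: y_test holds integer class ids rather than one-hot vectors
y_pred_classes = tf.argmax(model.predict(X_test), axis=1).numpy()
matrix = confusion_matrix(y_test, y_pred_classes)
print(json.dumps({"data": matrix.tolist()}))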
Troubleshooting
Confusion Matrix Not Appearing
Symptom: Only time series graphs appear, no confusion matrix option
Causes & Fixes:
Wrong JSON format:
# Wrong: Missing "data" key
print(json.dumps(matrix.tolist()))
# Correct: Must have "data" key
print(json.dumps({"data": matrix.tolist()}))Matrix not converted to list:
# Wrong: NumPy array not JSON serializable
print(json.dumps({"data": matrix}))
# Correct: Convert to list
print(json.dumps({"data": matrix.tolist()}))Matrix Shows Wrong Values
Symptom: Numbers don't match expected confusion matrix
Cause: Class labels not aligned
Solution: Ensure true and predicted labels use the same encoding:
import numpy as np
# If using one-hot encoded labels, convert both to class ids first
y_true_classes = np.argmax(y_true, axis=1)
y_pred_classes = np.argmax(y_pred, axis=1)
matrix = confusion_matrix(y_true_classes, y_pred_classes)
Matrix Shape Changes Between Epochs
Symptom: Early epochs have smaller matrices
Cause: Not all classes predicted yet
Solution: Specify all class labels explicitly:
matrix = confusion_matrix(
    y_true,
    y_pred,
    labels=list(range(num_classes))  # [0, 1, 2, ..., num_classes-1]
)
Next Steps
Visualize time series metrics alongside confusion matrices
Compare executions to see how different models handle class confusion
Compare output images for misclassified examples
Back to Visualize Metrics overview