# Confusion Matrix

Confusion matrices show how your classifier performs across all classes. See where your model excels, where it struggles, and which classes get confused with each other.

***

### Quick Start

#### 1. Log Confusion Matrix Data

Print the confusion matrix as a JSON object with a single `data` key:

```python
from sklearn.metrics import confusion_matrix
import json

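# Get predictions (assumes a trained `model` and a held-out X_test / y_test split)
y_pred = model.predict(X_test)

# Compute confusion matrix: true labels first, predictions second
matrix = confusion_matrix(y_test, y_pred)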

# Log to Valohai
print(
    json.dumps(
        {
            "data": matrix.tolist(),
        },
    ),
)
```

**Required format:** `{"data": [[row1], [row2], ...]}`

#### Add Labels

If the first item in `data` is a list of strings, those strings are used as labels. Make sure the number of labels matches both the number of rows and the number of items per row.

```python
print(
    json.dumps(
        {
            "data": [["y_true", "y_pred"], [50, 2], [3, 45]],
        },
    ),
)
```
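
The labeled payload can also be built programmatically instead of by hand. The following is a minimal sketch: the helper name `with_labels` and the class names `"cat"` and `"dog"` are hypothetical, and the only format assumed is the one described above (labels as the first item in `data`).

```python
import json


def with_labels(labels, matrix):
    """Return a payload dict with the label row prepended to the matrix rows."""
    rows = [list(row) for row in matrix]
    # There must be one row per label and one entry per label in every row.
    if len(rows) != len(labels) or any(len(row) != len(labels) for row in rows):
        raise ValueError("label count must match the matrix dimensions")
    return {"data": [list(labels)] + rows}


# Hypothetical class names, used for illustration only
payload = with_labels(["cat", "dog"], [[50, 2], [3, 45]])
print(json.dumps(payload))
# Output: {"data": [["cat", "dog"], [50, 2], [3, 45]]}
```

Checking the dimensions before printing catches a label/row mismatch in your own code rather than in the visualization.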

***

#### 2. View in Metadata Tab

1. Open your execution
2. Click the **Metadata** tab
3. Click the visualization dropdown (shows "Time Series" by default)
4. Select **Confusion Matrix**

The confusion matrix visualization appears automatically.

<figure><img src="/files/O0E8IYirsJOfhaD9cPx7" alt=""><figcaption></figcaption></figure>

***

### Complete Example

#### Binary Classification

```python
from sklearn.metrics import confusion_matrix
import json

# True labels and predictions
y_true = [0, 1, 0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 0, 1, 1, 1, 1, 0]

# Compute confusion matrix
matrix = confusion_matrix(y_true, y_pred)

# Log to Valohai
print(
    json.dumps(
        {
            "data": matrix.tolist(),
        },
    ),
)

# Output: {"data": [[4, 1], [1, 4]]}
```

**Interpretation:**

* Top-left (4): True negatives
* Top-right (1): False positives
* Bottom-left (1): False negatives
* Bottom-right (4): True positives
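
The four cells are enough to derive the usual summary metrics. As a small worked example, using the values from the output above and assuming the `[[TN, FP], [FN, TP]]` layout listed in the interpretation:

```python
# Values from the binary example: [[TN, FP], [FN, TP]]
matrix = [[4, 1], [1, 4]]
(tn, fp), (fn, tp) = matrix

accuracy = (tp + tn) / (tp + tn + fp + fn)  # correct predictions / all predictions
precision = tp / (tp + fp)                  # how many predicted positives were right
recall = tp / (tp + fn)                     # how many actual positives were found

print(accuracy, precision, recall)
# Output: 0.8 0.8 0.8
```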

***

#### Multi-Class Classification

```python
from sklearn.metrics import confusion_matrix
import json

# Example: 3-class problem
y_true = [2, 0, 2, 2, 0, 1, 1, 2, 2, 0, 1, 2]
y_pred = [0, 0, 2, 1, 0, 2, 1, 0, 2, 0, 2, 2]

# Compute confusion matrix
matrix = confusion_matrix(y_true, y_pred)

# Convert to list and log
result = matrix.tolist()
print(
    json.dumps(
        {
            "data": result,
        },
    ),
)

# Output: {"data": [[3, 0, 0], [0, 1, 2], [2, 1, 3]]}
```

**Interpretation:**

* Rows represent true labels
* Columns represent predicted labels
* Diagonal shows correct predictions
* Off-diagonal shows misclassifications
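
Reading along rows and columns gives per-class recall and precision. The sketch below works directly on the matrix from the output above and uses only plain Python; the variable names are illustrative.

```python
# Matrix from the 3-class example: rows are true labels, columns are predictions
matrix = [[3, 0, 0], [0, 1, 2], [2, 1, 3]]
n = len(matrix)

row_totals = [sum(row) for row in matrix]                             # samples per true class
col_totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]  # samples per predicted class

# Recall: diagonal cell divided by its row total (guarding against empty rows)
recall = [matrix[k][k] / row_totals[k] if row_totals[k] else 0.0 for k in range(n)]
# Precision: diagonal cell divided by its column total (guarding against empty columns)
precision = [matrix[k][k] / col_totals[k] if col_totals[k] else 0.0 for k in range(n)]

for k in range(n):
    print(f"class {k}: recall={recall[k]:.2f} precision={precision[k]:.2f}")
# Output:
# class 0: recall=1.00 precision=0.60
# class 1: recall=0.33 precision=0.50
# class 2: recall=0.50 precision=0.60
```

Here class 0 is always recognized when it occurs (recall 1.00), but two class-2 samples are also predicted as class 0, which lowers its precision.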

***

### With PyTorch

```python
import torch
from sklearn.metrics import confusion_matrix
import json

# After training/validation
model.eval()
all_preds = []
all_labels = []

with torch.no_grad():
    for data, labels in test_loader:
        outputs = model(data)
        _, predicted = torch.max(outputs, 1)

        all_preds.extend(predicted.cpu().numpy())
        all_labels.extend(labels.cpu().numpy())

# Compute confusion matrix
matrix = confusion_matrix(all_labels, all_preds)

# Log to Valohai
print(
    json.dumps(
        {
            "data": matrix.tolist(),
        },
    ),
)
```

***

### With TensorFlow/Keras

```python
import tensorflow as tf
from sklearn.metrics import confusion_matrix
import json

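# Get class probabilities, then convert them to class indices
y_pred = model.predict(X_test)
y_pred_classes = tf.argmax(y_pred, axis=1).numpy()
# Assumes y_test is one-hot encoded; skip this argmax if it already holds class indices
y_true_classes = tf.argmax(y_test, axis=1).numpy()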

# Compute confusion matrix
matrix = confusion_matrix(y_true_classes, y_pred_classes)

# Log to Valohai
print(
    json.dumps(
        {
            "data": matrix.tolist(),
        },
    ),
)
```

***

### Troubleshooting

#### Confusion Matrix Not Appearing

**Symptom:** Only time series graphs appear, no confusion matrix option

**Causes & Fixes:**

**Wrong JSON format:**

```python
# Wrong: Missing "data" key
print(json.dumps(matrix.tolist()))

# Correct: Must have "data" key
print(json.dumps({"data": matrix.tolist()}))
```

**Matrix not converted to list:**

```python
# Wrong: a NumPy array is not JSON serializable (json.dumps raises TypeError)
print(json.dumps({"data": matrix}))

# Correct: Convert to list
print(json.dumps({"data": matrix.tolist()}))
```
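
Both mistakes can be caught before printing with a small check. This is a sketch only: `check_payload` is a hypothetical helper, not part of any Valohai library, and the square-matrix rule is an assumption based on what a confusion matrix is rather than a documented requirement.

```python
import json


def check_payload(payload):
    """Validate a {"data": [[...], ...]} payload and return it as a JSON string."""
    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError('payload must be a dict with a "data" key')
    rows = payload["data"]
    if not isinstance(rows, list) or not all(isinstance(r, list) for r in rows):
        raise ValueError('"data" must be a list of lists; call .tolist() on NumPy arrays')
    # Skip an optional leading row of string labels.
    if rows and rows[0] and all(isinstance(x, str) for x in rows[0]):
        rows = rows[1:]
    # Assumption: a confusion matrix has as many columns as rows.
    if any(len(r) != len(rows) for r in rows):
        raise ValueError("matrix must be square")
    return json.dumps(payload)


print(check_payload({"data": [[4, 1], [1, 4]]}))
# Output: {"data": [[4, 1], [1, 4]]}
```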

***

#### Matrix Shows Wrong Values

**Symptom:** Numbers don't match expected confusion matrix

**Cause:** Class labels not aligned

**Solution:** Ensure true and predicted labels use the same encoding:

```python
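import numpy as np

# If labels are one-hot encoded (or predictions are per-class scores),
# convert both to class indices before comparing them
y_true_classes = np.argmax(y_true, axis=1)
y_pred_classes = np.argmax(y_pred, axis=1)

matrix = confusion_matrix(y_true_classes, y_pred_classes)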
```

***

#### Matrix Shape Changes Between Epochs

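**Symptom:** Early epochs have smaller matrices

**Cause:** By default, scikit-learn builds the matrix only from classes that occur in `y_true` or `y_pred`, so a class that appears in neither is left out

**Solution:** Specify all class labels explicitly: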

```python
matrix = confusion_matrix(
    y_true,
    y_pred,
    labels=list(range(num_classes)),  # [0, 1, 2, ..., num_classes-1]
)
```
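
To see why fixing the label set keeps the shape stable, here is a dependency-free sketch of the same idea. `count_matrix` is a hypothetical stand-in written for illustration, not scikit-learn's implementation; it assumes labels are integers in `0..num_classes-1`.

```python
def count_matrix(y_true, y_pred, num_classes):
    """Count (true, predicted) pairs into a num_classes x num_classes grid."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


# An early evaluation in which class 2 never occurs in either list
early = count_matrix([0, 1, 0, 1], [0, 0, 0, 1], num_classes=3)
print(early)
# Output: [[2, 0, 0], [1, 1, 0], [0, 0, 0]]
```

The row and column for class 2 are present but all zero, so every epoch produces a 3x3 matrix.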

***

### Next Steps

* [Visualize time series metrics](/experiment-tracking/visualize-metrics/time-series.md) alongside confusion matrices
* [Compare executions](/experiment-tracking/compare-executions.md) to see how different models handle class confusion
* [Compare output images](/experiment-tracking/compare-images.md) for misclassified examples
* Back to [Visualize Metrics overview](/experiment-tracking/visualize-metrics.md)

