Confusion Matrix¶

Module Interface¶

ConfusionMatrix¶

class torchmetrics.ConfusionMatrix(num_classes, normalize=None, threshold=0.5, multilabel=False, **kwargs)[source]

Note

From v0.10 an ‘binary_*’, ‘multiclass_*’, `’multilabel_*’ version now exist of each classification metric. Moving forward we recommend using these versions. This base metric will still work as it did prior to v0.10 until v0.11. From v0.11 the task argument introduced in this metric will be required and the general order of arguments may change, such that this metric will just function as an single entrypoint to calling the three specialized versions.

Computes the confusion matrix.

Works with binary, multiclass, and multilabel data. Accepts probabilities or logits from a model output or integer class values in prediction. Works with multi-dimensional preds and target, but it should be noted that additional dimensions will be flattened.

Forward accepts

preds (float or long tensor): (N, ...) or (N, C, ...) where C is the number of classes
target (long tensor): (N, ...)

If preds and target are the same shape and preds is a float tensor, we use the self.threshold argument to convert into integer labels. This is the case for binary and multi-label probabilities or logits.

If preds has an extra dimension as in the case of multi-class scores we perform an argmax on dim=1.

If working with multilabel data, setting the is_multilabel argument to True will make sure that a confusion matrix gets calculated per label.

Parameters

num_classes¶ (int) – Number of classes in the dataset.
normalize¶ (Optional[str]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
threshold¶ (float) – Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.
multilabel¶ (bool) – determines if data is multilabel or not.
kwargs¶ (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example (binary data):

>>> from torchmetrics import ConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confmat = ConfusionMatrix(num_classes=2)
>>> confmat(preds, target)
tensor([[2, 0],
        [1, 1]])

Example (multiclass data):

>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([2, 1, 0, 1])
>>> confmat = ConfusionMatrix(num_classes=3)
>>> confmat(preds, target)
tensor([[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]])

Example (multilabel data):

>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0, 0, 1], [1, 0, 1]])
>>> confmat = ConfusionMatrix(num_classes=3, multilabel=True)
>>> confmat(preds, target)
tensor([[[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
        [[0, 1], [0, 1]]])

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute()[source]

Computes confusion matrix.

Return type: Tensor
Returns: If multilabel=False this will be a [n_classes, n_classes] tensor and if multilabel=True this will be a [n_classes, 2, 2] tensor.

update(preds, target)[source]

Update state with predictions and targets.

Parameters

preds¶ (Tensor) – Predictions from model
target¶ (Tensor) – Ground truth values

Return type

None

BinaryConfusionMatrix¶

class torchmetrics.classification.BinaryConfusionMatrix(threshold=0.5, ignore_index=None, normalize=None, validate_args=True, **kwargs)[source]

Computes the confusion matrix for binary tasks.

Accepts the following input tensors:

preds (int or float tensor): (N, ...). If preds is a floating point tensor with values outside [0,1] range we consider the input to be logits and will auto apply sigmoid per element. Addtionally, we convert to int tensor with thresholding using the value in threshold.
target (int tensor): (N, ...)

Additional dimension ... will be flattened into the batch dimension.

Parameters

threshold¶ (float) – Threshold for transforming probability to binary (0,1) predictions
ignore_index¶ (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
normalize¶ (Optional[Literal[‘true’, ‘pred’, ‘all’, ‘none’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
validate_args¶ (bool) – bool indicating if input arguments and tensors should be validated for correctness. Set to False for faster computations.
kwargs¶ (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example (preds is int tensor):

>>> from torchmetrics.classification import BinaryConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> metric = BinaryConfusionMatrix()
>>> metric(preds, target)
tensor([[2, 0],
        [1, 1]])

Example (preds is float tensor):

>>> from torchmetrics.classification import BinaryConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0.35, 0.85, 0.48, 0.01])
>>> metric = BinaryConfusionMatrix()
>>> metric(preds, target)
tensor([[2, 0],
        [1, 1]])

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute()[source]

Computes confusion matrix.

Returns an [2,2] matrix.

Return type: Tensor

update(preds, target)[source]

Update state with predictions and targets.

Parameters

preds¶ (Tensor) – Tensor with predictions
target¶ (Tensor) – Tensor with true labels

Return type

None

MulticlassConfusionMatrix¶

class torchmetrics.classification.MulticlassConfusionMatrix(num_classes, ignore_index=None, normalize=None, validate_args=True, **kwargs)[source]

Computes the confusion matrix for multiclass tasks.

Accepts the following input tensors:

preds: (N, ...) (int tensor) or (N, C, ..) (float tensor). If preds is a floating point we apply torch.argmax along the C dimension to automatically convert probabilities/logits into an int tensor.
target (int tensor): (N, ...)

Additional dimension ... will be flattened into the batch dimension.

Parameters

num_classes¶ (int) – Integer specifing the number of classes
ignore_index¶ (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
normalize¶ (Optional[Literal[‘none’, ‘true’, ‘pred’, ‘all’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
validate_args¶ (bool) – bool indicating if input arguments and tensors should be validated for correctness. Set to False for faster computations.
kwargs¶ (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example (pred is integer tensor):

>>> from torchmetrics.classification import MulticlassConfusionMatrix
>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([2, 1, 0, 1])
>>> metric = MulticlassConfusionMatrix(num_classes=3)
>>> metric(preds, target)
tensor([[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]])

Example (pred is float tensor):

>>> from torchmetrics.classification import MulticlassConfusionMatrix
>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([
...   [0.16, 0.26, 0.58],
...   [0.22, 0.61, 0.17],
...   [0.71, 0.09, 0.20],
...   [0.05, 0.82, 0.13],
... ])
>>> metric = MulticlassConfusionMatrix(num_classes=3)
>>> metric(preds, target)
tensor([[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]])

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute()[source]

Computes confusion matrix.

Returns an [num_classes, num_classes] matrix.

Return type: Tensor

update(preds, target)[source]

Update state with predictions and targets.

Parameters

preds¶ (Tensor) – Tensor with predictions
target¶ (Tensor) – Tensor with true labels

Return type

None

MultilabelConfusionMatrix¶

class torchmetrics.classification.MultilabelConfusionMatrix(num_labels, threshold=0.5, ignore_index=None, normalize=None, validate_args=True, **kwargs)[source]

Computes the confusion matrix for multilabel tasks.

Accepts the following input tensors:

preds (int or float tensor): (N, C, ...). If preds is a floating point tensor with values outside [0,1] range we consider the input to be logits and will auto apply sigmoid per element. Addtionally, we convert to int tensor with thresholding using the value in threshold.
target (int tensor): (N, C, ...)

Additional dimension ... will be flattened into the batch dimension.

Parameters

num_classes¶ – Integer specifing the number of labels
threshold¶ (float) – Threshold for transforming probability to binary (0,1) predictions
ignore_index¶ (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
normalize¶ (Optional[Literal[‘none’, ‘true’, ‘pred’, ‘all’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
validate_args¶ (bool) – bool indicating if input arguments and tensors should be validated for correctness. Set to False for faster computations.
kwargs¶ (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example (preds is int tensor):

>>> from torchmetrics.classification import MultilabelConfusionMatrix
>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0, 0, 1], [1, 0, 1]])
>>> metric = MultilabelConfusionMatrix(num_labels=3)
>>> metric(preds, target)
tensor([[[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
        [[0, 1], [0, 1]]])

Example (preds is float tensor):

>>> from torchmetrics.classification import MultilabelConfusionMatrix
>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0.11, 0.22, 0.84], [0.73, 0.33, 0.92]])
>>> metric = MultilabelConfusionMatrix(num_labels=3)
>>> metric(preds, target)
tensor([[[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
        [[0, 1], [0, 1]]])

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute()[source]

Computes confusion matrix.

Returns an [num_labels,2,2] matrix.

Return type: Tensor

update(preds, target)[source]

Update state with predictions and targets.

Parameters

preds¶ (Tensor) – Tensor with predictions
target¶ (Tensor) – Tensor with true labels

Return type

None

Functional Interface¶

confusion_matrix¶

torchmetrics.functional.confusion_matrix(preds, target, num_classes, normalize=None, threshold=0.5, multilabel=False, task=None, num_labels=None, ignore_index=None, validate_args=True)[source]

Note

From v0.10 an ‘binary_*’, ‘multiclass_*’, `’multilabel_*’ version now exist of each classification metric. Moving forward we recommend using these versions. This base metric will still work as it did prior to v0.10 until v0.11. From v0.11 the task argument introduced in this metric will be required and the general order of arguments may change, such that this metric will just function as an single entrypoint to calling the three specialized versions.

Computes the confusion matrix. Works with binary, multiclass, and multilabel data. Accepts probabilities or logits from a model output or integer class values in prediction. Works with multi-dimensional preds and target, but it should be noted that additional dimensions will be flattened.

If preds and target are the same shape and preds is a float tensor, we use the self.threshold argument to convert into integer labels. This is the case for binary and multi-label probabilities or logits.

If preds has an extra dimension as in the case of multi-class scores we perform an argmax on dim=1.

If working with multilabel data, setting the is_multilabel argument to True will make sure that a confusion matrix gets calculated per label.

Parameters

preds¶ (Tensor) – (float or long tensor), Either a (N, ...) tensor with labels or (N, C, ...) where C is the number of classes, tensor with labels/logits/probabilities
target¶ (Tensor) – target (long tensor), tensor with shape (N, ...) with ground true labels
num_classes¶ (int) – Number of classes in the dataset.
normalize¶ (Optional[Literal[‘true’, ‘pred’, ‘all’, ‘none’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
threshold¶ (float) – Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.
multilabel¶ (bool) – determines if data is multilabel or not.

Example (binary data):

>>> from torchmetrics import ConfusionMatrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> confmat = ConfusionMatrix(num_classes=2)
>>> confmat(preds, target)
tensor([[2, 0],
        [1, 1]])

Example (multiclass data):

>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([2, 1, 0, 1])
>>> confmat = ConfusionMatrix(num_classes=3)
>>> confmat(preds, target)
tensor([[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]])

Example (multilabel data):

>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0, 0, 1], [1, 0, 1]])
>>> confmat = ConfusionMatrix(num_classes=3, multilabel=True)
>>> confmat(preds, target)
tensor([[[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
        [[0, 1], [0, 1]]])

Return type: Tensor

binary_confusion_matrix¶

torchmetrics.functional.classification.binary_confusion_matrix(preds, target, threshold=0.5, normalize=None, ignore_index=None, validate_args=True)[source]

Computes the confusion matrix for binary tasks.

Accepts the following input tensors:

preds (int or float tensor): (N, ...). If preds is a floating point tensor with values outside [0,1] range we consider the input to be logits and will auto apply sigmoid per element. Addtionally, we convert to int tensor with thresholding using the value in threshold.
target (int tensor): (N, ...)

Additional dimension ... will be flattened into the batch dimension.

Parameters

preds¶ (Tensor) – Tensor with predictions
target¶ (Tensor) – Tensor with true labels
threshold¶ (float) – Threshold for transforming probability to binary (0,1) predictions
normalize¶ (Optional[Literal[‘true’, ‘pred’, ‘all’, ‘none’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
ignore_index¶ (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
validate_args¶ (bool) – bool indicating if input arguments and tensors should be validated for correctness. Set to False for faster computations.

Return type

Tensor

Returns

A [2, 2] tensor

Example (preds is int tensor):

>>> from torchmetrics.functional.classification import binary_confusion_matrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0, 1, 0, 0])
>>> binary_confusion_matrix(preds, target)
tensor([[2, 0],
        [1, 1]])

Example (preds is float tensor):

>>> from torchmetrics.functional.classification import binary_confusion_matrix
>>> target = torch.tensor([1, 1, 0, 0])
>>> preds = torch.tensor([0.35, 0.85, 0.48, 0.01])
>>> binary_confusion_matrix(preds, target)
tensor([[2, 0],
        [1, 1]])

multiclass_confusion_matrix¶

torchmetrics.functional.classification.multiclass_confusion_matrix(preds, target, num_classes, normalize=None, ignore_index=None, validate_args=True)[source]

Computes the confusion matrix for multiclass tasks.

Accepts the following input tensors:

preds: (N, ...) (int tensor) or (N, C, ..) (float tensor). If preds is a floating point we apply torch.argmax along the C dimension to automatically convert probabilities/logits into an int tensor.
target (int tensor): (N, ...)

Additional dimension ... will be flattened into the batch dimension.

Parameters

preds¶ (Tensor) – Tensor with predictions
target¶ (Tensor) – Tensor with true labels
num_classes¶ (int) – Integer specifing the number of classes
normalize¶ (Optional[Literal[‘true’, ‘pred’, ‘all’, ‘none’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
ignore_index¶ (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
validate_args¶ (bool) – bool indicating if input arguments and tensors should be validated for correctness. Set to False for faster computations.

Return type

Tensor

Returns

A [num_classes, num_classes] tensor

Example (pred is integer tensor):

>>> from torchmetrics.functional.classification import multiclass_confusion_matrix
>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([2, 1, 0, 1])
>>> multiclass_confusion_matrix(preds, target, num_classes=3)
tensor([[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]])

Example (pred is float tensor):

>>> from torchmetrics.functional.classification import multiclass_confusion_matrix
>>> target = torch.tensor([2, 1, 0, 0])
>>> preds = torch.tensor([
...   [0.16, 0.26, 0.58],
...   [0.22, 0.61, 0.17],
...   [0.71, 0.09, 0.20],
...   [0.05, 0.82, 0.13],
... ])
>>> multiclass_confusion_matrix(preds, target, num_classes=3)
tensor([[1, 1, 0],
        [0, 1, 0],
        [0, 0, 1]])

multilabel_confusion_matrix¶

torchmetrics.functional.classification.multilabel_confusion_matrix(preds, target, num_labels, threshold=0.5, normalize=None, ignore_index=None, validate_args=True)[source]

Computes the confusion matrix for multilabel tasks.

Accepts the following input tensors:

preds (int or float tensor): (N, C, ...). If preds is a floating point tensor with values outside [0,1] range we consider the input to be logits and will auto apply sigmoid per element. Addtionally, we convert to int tensor with thresholding using the value in threshold.
target (int tensor): (N, C, ...)

Additional dimension ... will be flattened into the batch dimension.

Parameters

preds¶ (Tensor) – Tensor with predictions
target¶ (Tensor) – Tensor with true labels
num_labels¶ (int) – Integer specifing the number of labels
threshold¶ (float) – Threshold for transforming probability to binary (0,1) predictions
normalize¶ (Optional[Literal[‘true’, ‘pred’, ‘all’, ‘none’]]) –
Normalization mode for confusion matrix. Choose from:
- None or 'none': no normalization (default)
- 'true': normalization over the targets (most commonly used)
- 'pred': normalization over the predictions
- 'all': normalization over the whole matrix
ignore_index¶ (Optional[int]) – Specifies a target value that is ignored and does not contribute to the metric calculation
validate_args¶ (bool) – bool indicating if input arguments and tensors should be validated for correctness. Set to False for faster computations.

Return type

Tensor

Returns

A [num_labels, 2, 2] tensor

Example (preds is int tensor):

>>> from torchmetrics.functional.classification import multilabel_confusion_matrix
>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0, 0, 1], [1, 0, 1]])
>>> multilabel_confusion_matrix(preds, target, num_labels=3)
tensor([[[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
        [[0, 1], [0, 1]]])

Example (preds is float tensor):

>>> from torchmetrics.functional.classification import multilabel_confusion_matrix
>>> target = torch.tensor([[0, 1, 0], [1, 0, 1]])
>>> preds = torch.tensor([[0.11, 0.22, 0.84], [0.73, 0.33, 0.92]])
>>> multilabel_confusion_matrix(preds, target, num_labels=3)
tensor([[[1, 0], [0, 1]],
        [[1, 0], [1, 0]],
        [[0, 1], [0, 1]]])