Accuracy

Module Interface

class torchmetrics.Accuracy(threshold=0.5, num_classes=None, average='micro', mdmc_average=None, ignore_index=None, top_k=None, multiclass=None, subset_accuracy=False, **kwargs)

Computes Accuracy:

$\text{Accuracy} = \frac{1}{N}\sum_i^N 1(y_i = \hat{y}_i)$

Where $y$ is a tensor of target values, and $\hat{y}$ is a tensor of predictions.
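As a quick check of the formula, the following plain-Python sketch counts matching positions and divides by $N$. It is not the library's implementation; the helper name `simple_accuracy` is hypothetical, and lists stand in for tensors.

```python
def simple_accuracy(preds, target):
    """Fraction of positions where the prediction equals the target."""
    matches = sum(1 for p, t in zip(preds, target) if p == t)
    return matches / len(target)


target = [0, 1, 2, 3]
preds = [0, 2, 1, 3]

# Positions 0 and 3 match, so the result is 2 / 4.
acc = simple_accuracy(preds, target)
print(acc)  # 0.5
```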

For multi-class and multi-dimensional multi-class data with probability or logits predictions, the parameter top_k generalizes this metric to a Top-K accuracy metric: for each sample the top-K highest probability or logit score items are considered to find the correct label.
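The Top-K rule can be sketched in plain Python as follows. This is an illustration of the arithmetic only, not the library's code; `top_k_accuracy` is a hypothetical helper, and nested lists stand in for an `(N, C)` tensor of scores.

```python
def top_k_accuracy(scores, target, k=1):
    """Fraction of samples whose target is among the k highest-scoring classes."""
    hits = 0
    for row, label in zip(scores, target):
        # Class indices ordered by descending score; keep the first k.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if label in ranked[:k]:
            hits += 1
    return hits / len(target)


scores = [[0.1, 0.9, 0.0], [0.3, 0.1, 0.6], [0.2, 0.5, 0.3]]
target = [0, 1, 2]

top1 = top_k_accuracy(scores, target, k=1)  # no sample has its target ranked first
top2 = top_k_accuracy(scores, target, k=2)  # samples 0 and 2 have it in the top two
print(round(top1, 4), round(top2, 4))  # 0.0 0.6667
```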

For multi-label and multi-dimensional multi-class inputs, this metric computes the “global” accuracy by default, which counts all labels or sub-samples separately. This can be changed to subset accuracy (which requires all labels or sub-samples in the sample to be correctly predicted) by setting subset_accuracy=True.
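The difference between the two modes can be seen in a small multi-label sketch. It is plain Python rather than the library's implementation; the helpers `binarize`, `global_accuracy` and `subset_accuracy_score` are hypothetical, and the `>=` comparison against the threshold is an assumption of this sketch.

```python
def binarize(probs, threshold=0.5):
    """Turn per-label probabilities into 0/1 predictions (assumes >= threshold)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probs]


def global_accuracy(preds, target):
    """Count every label separately."""
    pairs = [(p, t) for pr, tr in zip(preds, target) for p, t in zip(pr, tr)]
    return sum(1 for p, t in pairs if p == t) / len(pairs)


def subset_accuracy_score(preds, target):
    """A sample counts only if all of its labels are correct."""
    return sum(1 for pr, tr in zip(preds, target) if pr == tr) / len(target)


probs = [[0.9, 0.2, 0.7], [0.1, 0.8, 0.6]]
target = [[1, 0, 0], [0, 1, 1]]
preds = binarize(probs)  # [[1, 0, 1], [0, 1, 1]]

global_acc = global_accuracy(preds, target)        # 5 of 6 labels correct
subset_acc = subset_accuracy_score(preds, target)  # 1 of 2 samples fully correct
print(round(global_acc, 4), subset_acc)  # 0.8333 0.5
```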

Accepts all input types listed in Input types.

Parameters

num_classes (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold (float) – Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.
average (Optional[str]) –
Defines the reduction that is applied. Should be one of the following:
- 'micro' [default]: Calculate the metric globally, across all samples and classes.
- 'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
- 'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
- 'none' or None: Calculate the metric for each class separately, and return the metric for every class.
- 'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note

What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.

Note

If 'none' and a given class doesn’t occur in the preds or target, the value for the class will be nan.
mdmc_average (Optional[str]) –
Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
- None [default]: Should be left unchanged if your data is not multi-dimensional multi-class.
- 'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
- 'global': In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
ignore_index (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
top_k (Optional[int]) –
Number of the highest probability or logit score predictions considered when finding the correct label, relevant only for (multi-dimensional) multi-class inputs. The default value (None) will be interpreted as 1 for these inputs.

Should be left at default (None) for all other types of inputs.
multiclass (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
subset_accuracy (bool) –
Whether to compute subset accuracy for multi-label and multi-dimensional multi-class inputs (has no effect for other input types).
- For multi-label inputs, if the parameter is set to True, then all labels for each sample must be correctly predicted for the sample to count as correct. If it is set to False, then all labels are counted separately - this is equivalent to flattening inputs beforehand (i.e. preds = preds.flatten() and same for target).
- For multi-dimensional multi-class inputs, if the parameter is set to True, then all sub-samples (on the extra axis) must be correct for the sample to be counted as correct. If it is set to False, then all sub-samples are counted separately - this is equivalent, in the case of label predictions, to flattening the inputs beforehand (i.e. preds = preds.flatten() and same for target). Note that the top_k parameter still applies in both cases, if set.
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
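The reductions selected by average can be compared on a small multi-class example. The sketch below is plain Python and makes an assumption: per-class accuracy is taken as the fraction of samples of that class that are predicted correctly, tp / (tp + fn), with the support tp + fn as the weight. The helper `per_class_accuracy` is hypothetical and expects every class to occur in target.

```python
def per_class_accuracy(preds, target, num_classes):
    """Return (per-class scores, per-class supports); each class must occur in target."""
    scores, supports = [], []
    for c in range(num_classes):
        support = sum(1 for t in target if t == c)                         # tp + fn
        hits = sum(1 for p, t in zip(preds, target) if t == c and p == c)  # tp
        scores.append(hits / support)
        supports.append(support)
    return scores, supports


target = [0, 0, 0, 1, 2, 2]
preds = [0, 0, 1, 1, 2, 0]

scores, supports = per_class_accuracy(preds, target, num_classes=3)

# 'micro': pool all samples and classes.
micro = sum(1 for p, t in zip(preds, target) if p == t) / len(target)
# 'macro': unweighted mean over classes.
macro = sum(scores) / len(scores)
# 'weighted': mean over classes, weighted by support.
weighted = sum(s * n for s, n in zip(scores, supports)) / sum(supports)

# 'none' corresponds to returning `scores` itself, one value per class.
print([round(s, 4) for s in scores])  # [0.6667, 1.0, 0.5]
print(round(micro, 4), round(macro, 4), round(weighted, 4))  # 0.6667 0.7222 0.6667
```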

Raises

ValueError – If top_k is not an integer larger than 0.
ValueError – If average is not one of "micro", "macro", "weighted", "samples", "none", None.
ValueError – If two different input modes are provided, e.g. using multi-label with multi-class.
ValueError – If top_k parameter is set for multi-label inputs.

Example

>>> import torch
>>> from torchmetrics import Accuracy
>>> target = torch.tensor([0, 1, 2, 3])
>>> preds = torch.tensor([0, 2, 1, 3])
>>> accuracy = Accuracy()
>>> accuracy(preds, target)
tensor(0.5000)

>>> target = torch.tensor([0, 1, 2])
>>> preds = torch.tensor([[0.1, 0.9, 0], [0.3, 0.1, 0.6], [0.2, 0.5, 0.3]])
>>> accuracy = Accuracy(top_k=2)
>>> accuracy(preds, target)
tensor(0.6667)


compute()

Computes accuracy based on inputs passed in to update previously.

Return type: Tensor

update(preds, target)

Update state with predictions and targets. See Input types for more information on input types.

Parameters

preds (Tensor) – Predictions from model (logits, probabilities, or labels)
target (Tensor) – Ground truth labels

Return type

None
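The update/compute split can be pictured with a minimal stand-in class. This is a sketch under the assumption that the state consists of running counts of correct and total predictions (the 'micro' case on label inputs). `RunningAccuracy` is a hypothetical name and not part of torchmetrics.

```python
class RunningAccuracy:
    """Accumulates counts across batches; compute() reduces them once."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, preds, target):
        # Only the counts are stored, not the individual predictions.
        self.correct += sum(1 for p, t in zip(preds, target) if p == t)
        self.total += len(target)

    def compute(self):
        return self.correct / self.total


metric = RunningAccuracy()
metric.update([0, 2, 1, 3], [0, 1, 2, 3])  # 2 of 4 correct
metric.update([1, 1, 2], [1, 1, 2])        # 3 of 3 correct
result = metric.compute()                  # 5 of 7 over all batches
print(round(result, 4))  # 0.7143
```

Accumulating counts rather than averaging per-batch scores keeps the result independent of batch size: the mean of the two batch accuracies here would be 0.75, not 5/7.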

Functional Interface

torchmetrics.functional.accuracy(preds, target, average='micro', mdmc_average='global', threshold=0.5, top_k=None, subset_accuracy=False, num_classes=None, multiclass=None, ignore_index=None)

Computes Accuracy:

$\text{Accuracy} = \frac{1}{N}\sum_i^N 1(y_i = \hat{y}_i)$

Where $y$ is a tensor of target values, and $\hat{y}$ is a tensor of predictions.

For multi-class and multi-dimensional multi-class data with probability or logits predictions, the parameter top_k generalizes this metric to a Top-K accuracy metric: for each sample the top-K highest probability or logit score items are considered to find the correct label.

For multi-label and multi-dimensional multi-class inputs, this metric computes the “global” accuracy by default, which counts all labels or sub-samples separately. This can be changed to subset accuracy (which requires all labels or sub-samples in the sample to be correctly predicted) by setting subset_accuracy=True.

Accepts all input types listed in Input types.

Parameters

preds (Tensor) – Predictions from model (probabilities, logits or labels)
target (Tensor) – Ground truth labels
average (Optional[str]) –
Defines the reduction that is applied. Should be one of the following:
- 'micro' [default]: Calculate the metric globally, across all samples and classes.
- 'macro': Calculate the metric for each class separately, and average the metrics across classes (with equal weights for each class).
- 'weighted': Calculate the metric for each class separately, and average the metrics across classes, weighting each class by its support (tp + fn).
- 'none' or None: Calculate the metric for each class separately, and return the metric for every class.
- 'samples': Calculate the metric for each sample, and average the metrics across samples (with equal weights for each sample).
Note

What is considered a sample in the multi-dimensional multi-class case depends on the value of mdmc_average.

Note

If 'none' and a given class doesn’t occur in the preds or target, the value for the class will be nan.
mdmc_average (Optional[str]) –
Defines how averaging is done for multi-dimensional multi-class inputs (on top of the average parameter). Should be one of the following:
- None: Intended for data that is not multi-dimensional multi-class.
- 'samplewise': In this case, the statistics are computed separately for each sample on the N axis, and then averaged over samples. The computation for each sample is done by treating the flattened extra axes ... (see Input types) as the N dimension within the sample, and computing the metric for the sample based on that.
- 'global' [default]: In this case the N and ... dimensions of the inputs (see Input types) are flattened into a new N_X sample axis, i.e. the inputs are treated as if they were (N_X, C). From here on the average parameter applies as usual.
num_classes (Optional[int]) – Number of classes. Necessary for 'macro', 'weighted' and None average methods.
threshold (float) – Threshold for transforming probability or logit predictions to binary (0,1) predictions, in the case of binary or multi-label inputs. Default value of 0.5 corresponds to input being probabilities.
top_k (Optional[int]) –
Number of the highest probability or logit score predictions considered when finding the correct label, relevant only for (multi-dimensional) multi-class inputs. The default value (None) will be interpreted as 1 for these inputs.

Should be left at default (None) for all other types of inputs.
multiclass (Optional[bool]) – Used only in certain special cases, where you want to treat inputs as a different type than what they appear to be. See the parameter’s documentation section for a more detailed explanation and examples.
ignore_index (Optional[int]) – Integer specifying a target class to ignore. If given, this class index does not contribute to the returned score, regardless of reduction method. If an index is ignored, and average=None or 'none', the score for the ignored class will be returned as nan.
subset_accuracy (bool) –
Whether to compute subset accuracy for multi-label and multi-dimensional multi-class inputs (has no effect for other input types).
- For multi-label inputs, if the parameter is set to True, then all labels for each sample must be correctly predicted for the sample to count as correct. If it is set to False, then all labels are counted separately - this is equivalent to flattening inputs beforehand (i.e. preds = preds.flatten() and same for target).
- For multi-dimensional multi-class inputs, if the parameter is set to True, then all sub-samples (on the extra axis) must be correct for the sample to be counted as correct. If it is set to False, then all sub-samples are counted separately - this is equivalent, in the case of label predictions, to flattening the inputs beforehand (i.e. preds = preds.flatten() and same for target). Note that the top_k parameter still applies in both cases, if set.
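The two mdmc_average modes can be contrasted on label inputs of shape (N, X). The sketch below is plain Python and assumes 'macro' averaging with per-class accuracy taken as tp / (tp + fn); the helper `macro_accuracy` is hypothetical and expects every class to occur in each target it receives. It illustrates the order of operations only, not the library's exact handling of edge cases.

```python
def macro_accuracy(preds, target, num_classes):
    """Unweighted mean of per-class accuracy; each class must occur in target."""
    per_class = []
    for c in range(num_classes):
        support = sum(1 for t in target if t == c)
        hits = sum(1 for p, t in zip(preds, target) if t == c and p == c)
        per_class.append(hits / support)
    return sum(per_class) / num_classes


# N = 2 samples, each with X = 4 sub-samples, and C = 2 classes.
preds = [[0, 0, 0, 0], [1, 1, 0, 0]]
target = [[0, 0, 0, 1], [1, 1, 1, 0]]

# 'samplewise': compute the metric per sample, then average over the N axis.
samplewise = sum(macro_accuracy(p, t, 2) for p, t in zip(preds, target)) / len(preds)

# 'global': flatten N and the extra axis into one sample axis, then compute once.
flat_preds = [p for row in preds for p in row]
flat_target = [t for row in target for t in row]
global_score = macro_accuracy(flat_preds, flat_target, 2)

print(round(samplewise, 4), global_score)  # 0.6667 0.75
```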

Raises

ValueError – If top_k parameter is set for multi-label inputs.
ValueError – If average is not one of "micro", "macro", "weighted", "samples", "none", None.
ValueError – If mdmc_average is not one of None, "samplewise", "global".
ValueError – If average is set to "macro", "weighted", "none" or None but num_classes is not provided.
ValueError – If num_classes is set and ignore_index is not in the range [0, num_classes).
ValueError – If top_k is not an integer larger than 0.

Example

>>> import torch
>>> from torchmetrics.functional import accuracy
>>> target = torch.tensor([0, 1, 2, 3])
>>> preds = torch.tensor([0, 2, 1, 3])
>>> accuracy(preds, target)
tensor(0.5000)

>>> target = torch.tensor([0, 1, 2])
>>> preds = torch.tensor([[0.1, 0.9, 0], [0.3, 0.1, 0.6], [0.2, 0.5, 0.3]])
>>> accuracy(preds, target, top_k=2)
tensor(0.6667)

Return type: Tensor