Scale-Invariant Signal-to-Distortion Ratio (SI-SDR)

Module Interface

class torchmetrics.ScaleInvariantSignalDistortionRatio(zero_mean=False, **kwargs)

Scale-invariant signal-to-distortion ratio (SI-SDR). SI-SDR is generally regarded as an overall measure of how good an estimated source sounds; it is expressed in dB, and higher values are better.
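
For reference, SI-SDR is commonly defined as shown below. This is a sketch of the standard definition, not a statement of what the library does internally. The symbols are chosen here for illustration: $\hat{s}$ stands for preds and $s$ for target. Implementations typically add a small epsilon to the numerators and denominators for numerical stability, which is omitted here.

```latex
\alpha = \frac{\langle \hat{s}, s \rangle}{\lVert s \rVert^{2}}, \qquad
\text{SI-SDR} = 10 \log_{10} \frac{\lVert \alpha s \rVert^{2}}{\lVert \alpha s - \hat{s} \rVert^{2}}
```

Because the target is rescaled by $\alpha$ before the comparison, multiplying either signal by a non-zero constant leaves the value unchanged.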

As input to forward and update, the metric accepts the following inputs:

preds (Tensor): float tensor with shape (..., time)
target (Tensor): float tensor with shape (..., time)

As output of forward and compute, the metric returns the following output:

si_sdr (Tensor): float scalar tensor with the average SI-SDR value over samples
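
The relationship between per-sample values and the averaged scalar can be sketched in plain Python. This is a minimal illustration using the standard SI-SDR definition on lists of floats, not the library's code: the helper `si_sdr_1d` and the second signal pair are made up for the example, and no epsilon is added for numerical stability.

```python
import math

def si_sdr_1d(preds, target):
    """SI-SDR in dB for two equal-length float sequences (hypothetical helper)."""
    # Scale factor that projects the estimate onto the target
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled = [alpha * t for t in target]
    # Whatever the scaled target does not explain is treated as distortion
    noise = [s - p for s, p in zip(scaled, preds)]
    signal_energy = sum(s * s for s in scaled)
    noise_energy = sum(n * n for n in noise)
    return 10 * math.log10(signal_energy / noise_energy)

# A "batch" of two signals, i.e. the analogue of shape (2, time)
target = [[3.0, -0.5, 2.0, 7.0], [1.0, 2.0, 3.0, 4.0]]
preds = [[2.5, 0.0, 2.0, 8.0], [1.5, 1.5, 3.5, 3.5]]

# One value per sample, then a single average over all samples
per_sample = [si_sdr_1d(p, t) for p, t in zip(preds, target)]
average = sum(per_sample) / len(per_sample)
print(per_sample, average)
```

The first pair is the one used in the example on this page, so its per-sample value should agree with the documented result up to rounding.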

Parameters

zero_mean (bool) – whether to subtract the mean from target and preds before computing the metric
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
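
The effect of zero_mean can be illustrated with a small plain-Python sketch. It assumes the standard SI-SDR definition and that zero-meaning subtracts each signal's mean along the time axis; the helper `si_sdr` and the test signals are hypothetical, and the library's own implementation may differ in details such as epsilon handling.

```python
import math

def si_sdr(preds, target, zero_mean=False):
    """SI-SDR in dB for two equal-length float sequences (hypothetical helper)."""
    if zero_mean:
        # Remove the DC component from both signals before comparing them
        mean_p = sum(preds) / len(preds)
        mean_t = sum(target) / len(target)
        preds = [p - mean_p for p in preds]
        target = [t - mean_t for t in target]
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled = [alpha * t for t in target]
    noise_energy = sum((s - p) ** 2 for s, p in zip(scaled, preds))
    signal_energy = sum(s * s for s in scaled)
    return 10 * math.log10(signal_energy / noise_energy)

target = [1.0, -1.0, 2.0, -2.0]
preds = [1.1, -0.9, 1.8, -2.2]
# The same estimate with a constant (DC) offset added
offset_preds = [p + 0.5 for p in preds]

uncentred = si_sdr(offset_preds, target)                # offset counts as distortion
centred = si_sdr(offset_preds, target, zero_mean=True)  # offset is removed first
reference = si_sdr(preds, target, zero_mean=True)
print(uncentred, centred, reference)
```

In this sketch the constant offset lowers the score when zero_mean is off, while with zero_mean on the offset estimate scores the same as the original one.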

Raises

RuntimeError – if target and preds have different shapes

Example

>>> import torch
>>> from torchmetrics import ScaleInvariantSignalDistortionRatio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> si_sdr = ScaleInvariantSignalDistortionRatio()
>>> si_sdr(preds, target)
tensor(18.4030)


Functional Interface

torchmetrics.functional.scale_invariant_signal_distortion_ratio(preds, target, zero_mean=False)

Scale-invariant signal-to-distortion ratio (SI-SDR). SI-SDR is generally regarded as an overall measure of how good an estimated source sounds; it is expressed in dB, and higher values are better.
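
The "scale-invariant" part of the name can be checked with a short plain-Python sketch. It assumes the standard SI-SDR definition and is not the library's implementation: `si_sdr_reference` is a hypothetical helper that works on lists of floats and omits the small epsilon that implementations usually add.

```python
import math

def si_sdr_reference(preds, target):
    """SI-SDR in dB for two equal-length float sequences (hypothetical helper)."""
    dot = sum(p * t for p, t in zip(preds, target))
    target_energy = sum(t * t for t in target)
    alpha = dot / target_energy  # optimal rescaling of the target
    scaled = [alpha * t for t in target]
    signal_energy = sum(s * s for s in scaled)
    noise_energy = sum((s - p) ** 2 for s, p in zip(scaled, preds))
    return 10 * math.log10(signal_energy / noise_energy)

target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]

base = si_sdr_reference(preds, target)
# Rescaling the estimate or the target by a constant should not change the score
louder_preds = si_sdr_reference([10.0 * p for p in preds], target)
quieter_target = si_sdr_reference(preds, [0.1 * t for t in target])
print(base, louder_preds, quieter_target)
```

All three printed values should agree up to floating-point error, which is the property that distinguishes SI-SDR from a plain signal-to-distortion ratio.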

Parameters

preds (Tensor) – float tensor with shape (..., time)
target (Tensor) – float tensor with shape (..., time)
zero_mean (bool) – whether to subtract the mean from target and preds before computing the metric

Return type

Tensor

Returns

Float tensor with shape (...,) of SI-SDR values per sample

Raises

RuntimeError – If preds and target do not have the same shape

Example

>>> import torch
>>> from torchmetrics.functional.audio import scale_invariant_signal_distortion_ratio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> scale_invariant_signal_distortion_ratio(preds, target)
tensor(18.4030)