Scale-Invariant Signal-to-Distortion Ratio (SI-SDR)

Module Interface

class torchmetrics.audio.ScaleInvariantSignalDistortionRatio(zero_mean=False, **kwargs)

Scale-invariant signal-to-distortion ratio (SI-SDR).

The SI-SDR value is generally considered an overall measure of how good a source sounds.

As input to forward and update, the metric accepts the following input:

preds (Tensor): float tensor with shape (...,time)
target (Tensor): float tensor with shape (...,time)

As output of forward and compute, the metric returns the following output:

si_sdr (Tensor): float scalar tensor with average SI-SDR value over samples
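
As a rough illustration of the input and output conventions, the sketch below computes one SI-SDR value per signal and then averages them. It uses plain Python lists in place of tensors. The helper si_sdr_db is a hypothetical name written for this illustration from the standard SI-SDR definition; it is not the library's code, and it assumes that the reported average is the arithmetic mean of the per-sample values in dB.

```python
import math


def si_sdr_db(preds, target):
    """Illustrative SI-SDR in dB for two equal-length lists of floats.

    Hypothetical helper: no epsilon guard, so it fails if preds is an
    exact scaled copy of target (zero residual energy).
    """
    # Least-squares scale factor: <preds, target> / ||target||^2
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled = [alpha * t for t in target]
    residual = [s - p for s, p in zip(scaled, preds)]
    signal_energy = sum(s * s for s in scaled)
    residual_energy = sum(r * r for r in residual)
    return 10 * math.log10(signal_energy / residual_energy)


# A "batch" of two signals, each with four time steps: shape (2, time).
batch_target = [[3.0, -0.5, 2.0, 7.0], [1.0, 2.0, 3.0, 4.0]]
batch_preds = [[2.5, 0.0, 2.0, 8.0], [1.0, 2.0, 3.0, 5.0]]

# One value per signal, then a single averaged scalar.
per_sample = [si_sdr_db(p, t) for p, t in zip(batch_preds, batch_target)]
average = sum(per_sample) / len(per_sample)
print(per_sample, average)
```

The time axis is reduced away, so a batch of two signals yields two per-sample values and one averaged scalar.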

Parameters:

zero_mean (bool) – Whether to subtract the mean from target and preds before computing the metric
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Raises:

RuntimeError – If target and preds do not have the same shape

Example

>>> from torch import tensor
>>> from torchmetrics.audio import ScaleInvariantSignalDistortionRatio
>>> target = tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = tensor([2.5, 0.0, 2.0, 8.0])
>>> si_sdr = ScaleInvariantSignalDistortionRatio()
>>> si_sdr(preds, target)
tensor(18.4030)
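
The value above can be checked by hand. The following sketch, in plain Python, assumes the standard SI-SDR definition: project preds onto target, then compare the energy of the projection with the energy of the residual. Variable names are illustrative, and the small epsilon that a numerical implementation would typically add is omitted.

```python
import math

target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]

# Scale factor that best matches target to preds in the least-squares sense.
dot = sum(p * t for p, t in zip(preds, target))
target_energy = sum(t * t for t in target)
alpha = dot / target_energy

# Split preds into a scaled-target part and a residual ("distortion") part.
scaled_target = [alpha * t for t in target]
residual = [s - p for s, p in zip(scaled_target, preds)]

signal_energy = sum(s * s for s in scaled_target)
residual_energy = sum(r * r for r in residual)

# Ratio of the two energies, expressed in decibels.
si_sdr_db = 10 * math.log10(signal_energy / residual_energy)
print(round(si_sdr_db, 3))  # approximately 18.403
```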

plot(val=None, ax=None)

Plot a single or multiple values from the metric.

Parameters:

val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute, or a list of these results. If no value is provided, metric.compute is called automatically and its result is plotted.
ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot is added to that axis.

Return type:

Tuple[Figure, Union[Axes, ndarray]]

Returns:

Figure and Axes object

Raises:

ModuleNotFoundError – If matplotlib is not installed

>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.audio import ScaleInvariantSignalDistortionRatio
>>> target = torch.randn(5)
>>> preds = torch.randn(5)
>>> metric = ScaleInvariantSignalDistortionRatio()
>>> metric.update(preds, target)
>>> fig_, ax_ = metric.plot()

../_images/scale_invariant_signal_distortion_ratio-1.png

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.audio import ScaleInvariantSignalDistortionRatio
>>> target = torch.randn(5)
>>> preds = torch.randn(5)
>>> metric = ScaleInvariantSignalDistortionRatio()
>>> values = []
>>> for _ in range(10):
...     values.append(metric(preds, target))
>>> fig_, ax_ = metric.plot(values)

../_images/scale_invariant_signal_distortion_ratio-2.png

Functional Interface

torchmetrics.functional.audio.scale_invariant_signal_distortion_ratio(preds, target, zero_mean=False)

Scale-invariant signal-to-distortion ratio (SI-SDR).

The SI-SDR value is generally considered an overall measure of how good a source sounds.

Parameters:

preds (Tensor) – float tensor with shape (...,time)
target (Tensor) – float tensor with shape (...,time)
zero_mean (bool) – Whether to subtract the mean from target and preds before computing the metric

Return type:

Tensor

Returns:

Float tensor with shape (...,) of SI-SDR values per sample

Raises:

RuntimeError – If preds and target do not have the same shape
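
To make the effect of zero_mean concrete, here is a minimal sketch in plain Python. It assumes that zero-meaning simply subtracts each signal's own mean along the time axis before the usual SI-SDR computation; the helper si_sdr_db and its signature are hypothetical and written only for this illustration.

```python
import math


def si_sdr_db(preds, target, zero_mean=False):
    """Illustrative SI-SDR in dB (hypothetical helper, no epsilon guard)."""
    if zero_mean:
        # Remove the constant (DC) component of each signal.
        mean_p = sum(preds) / len(preds)
        mean_t = sum(target) / len(target)
        preds = [p - mean_p for p in preds]
        target = [t - mean_t for t in target]
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled = [alpha * t for t in target]
    residual = [s - p for s, p in zip(scaled, preds)]
    return 10 * math.log10(sum(s * s for s in scaled) / sum(r * r for r in residual))


target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]
preds_offset = [p + 10.0 for p in preds]  # same estimate plus a constant offset

# Without zero-meaning, the offset counts as distortion and lowers the score.
raw_clean = si_sdr_db(preds, target)
raw_offset = si_sdr_db(preds_offset, target)

# With zero-meaning, the offset is removed first and the score is unaffected.
centered_clean = si_sdr_db(preds, target, zero_mean=True)
centered_offset = si_sdr_db(preds_offset, target, zero_mean=True)

print(raw_clean, raw_offset)
print(centered_clean, centered_offset)
```

Under these assumptions, zero_mean=True is the setting to consider when a constant offset in either signal should not be penalized.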

Example

>>> import torch
>>> from torchmetrics.functional.audio import scale_invariant_signal_distortion_ratio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> scale_invariant_signal_distortion_ratio(preds, target)
tensor(18.4030)
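
The "scale-invariant" part of the name can be illustrated with a short sketch: multiplying preds by a nonzero constant should leave the value unchanged, because the target is rescaled to match before the ratio is taken. The code below is plain Python based on the standard SI-SDR definition; si_sdr_db is a hypothetical helper, not the library function.

```python
import math


def si_sdr_db(preds, target):
    """Illustrative SI-SDR in dB (hypothetical helper, no epsilon guard)."""
    # The scale factor absorbs any overall gain applied to preds.
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled = [alpha * t for t in target]
    residual = [s - p for s, p in zip(scaled, preds)]
    return 10 * math.log10(sum(s * s for s in scaled) / sum(r * r for r in residual))


target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]

original = si_sdr_db(preds, target)
louder = si_sdr_db([3.0 * p for p in preds], target)   # estimate with 3x gain
quieter = si_sdr_db([0.5 * p for p in preds], target)  # estimate with 0.5x gain

print(original, louder, quieter)
```

All three printed values agree up to floating-point rounding, which is the property that distinguishes SI-SDR from a plain signal-to-noise ratio computed on unscaled signals.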