Scale-Invariant Signal-to-Distortion Ratio (SI-SDR)

Module Interface

class torchmetrics.ScaleInvariantSignalDistortionRatio(zero_mean=False, **kwargs)

Scale-invariant signal-to-distortion ratio (SI-SDR). The SI-SDR value is generally considered an overall measure of how good a source sounds.
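For reference, the definition given by Le Roux et al. [1] can be written as below. The symbols are chosen here for illustration: $\hat{s}$ stands for preds and $s$ for target, both taken along the time axis. When zero_mean is True, each signal is assumed to have its mean over time subtracted first. Implementations commonly add a small epsilon for numerical stability; that detail is omitted here.

```latex
\alpha = \frac{\langle \hat{s}, s \rangle}{\lVert s \rVert^{2}},
\qquad
\text{SI-SDR}(\hat{s}, s) = 10 \log_{10}
\frac{\lVert \alpha s \rVert^{2}}{\lVert \alpha s - \hat{s} \rVert^{2}}
```

Because $\alpha$ rescales the target to best match the prediction, multiplying preds by any nonzero constant leaves the ratio unchanged.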

Forward accepts

preds: shape [...,time]
target: shape [...,time]

Parameters

zero_mean (bool) – whether to subtract the mean from target and preds before computing the metric
kwargs (Dict[str, Any]) – Additional keyword arguments, see Advanced metric settings for more info.

Raises

TypeError – if target and preds have a different shape

Returns

Average SI-SDR value

Example

>>> import torch
>>> from torchmetrics import ScaleInvariantSignalDistortionRatio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> si_sdr = ScaleInvariantSignalDistortionRatio()
>>> si_sdr(preds, target)
tensor(18.4030)
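The value in the example can be checked by hand. The following is a minimal pure-Python sketch of the SI-SDR formula from Le Roux et al., not the library's implementation: the helper name `si_sdr` is hypothetical, it handles a single 1-D signal pair only, and it omits any epsilon the library may add for numerical stability.

```python
import math


def si_sdr(preds, target, zero_mean=False):
    """Illustrative SI-SDR in dB for two equal-length 1-D sequences."""
    if len(preds) != len(target):
        raise ValueError("preds and target must have the same length")
    if zero_mean:
        # Remove the mean of each signal before comparing them.
        p_mean = sum(preds) / len(preds)
        t_mean = sum(target) / len(target)
        preds = [p - p_mean for p in preds]
        target = [t - t_mean for t in target]
    # Optimal scaling factor that projects preds onto target.
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    scaled = [alpha * t for t in target]
    noise = [s - p for s, p in zip(scaled, preds)]
    signal_power = sum(s * s for s in scaled)
    noise_power = sum(n * n for n in noise)
    return 10 * math.log10(signal_power / noise_power)


target = [3.0, -0.5, 2.0, 7.0]
preds = [2.5, 0.0, 2.0, 8.0]
value = si_sdr(preds, target)
# Rescaling the prediction should not change the score.
rescaled = si_sdr([2.0 * p for p in preds], target)
centered = si_sdr(preds, target, zero_mean=True)
print(round(value, 4), round(rescaled, 4), round(centered, 4))
```

The first two printed numbers should agree with each other and with the 18.4030 shown above, up to floating-point rounding, which illustrates the scale invariance. The third shows that zero_mean=True generally gives a different score.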

References

[1] Le Roux, Jonathan, et al. “SDR - half-baked or well done?” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.


compute()

Computes average SI-SDR.

Return type: Tensor

update(preds, target)

Update state with predictions and targets.

Parameters

preds (Tensor) – Predictions from model
target (Tensor) – Ground truth values

Return type

None
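How update() and compute() fit together can be sketched as a running sum and a count. The class below is a hypothetical pure-Python stand-in, not the library's code: the names `RunningSISDR`, `total` and `count` are illustrative, and it assumes that compute() returns the mean SI-SDR over every signal seen so far.

```python
import math


def si_sdr(preds, target):
    """Illustrative SI-SDR in dB for one pair of equal-length 1-D sequences."""
    alpha = sum(p * t for p, t in zip(preds, target)) / sum(t * t for t in target)
    signal_power = sum((alpha * t) ** 2 for t in target)
    noise_power = sum((alpha * t - p) ** 2 for p, t in zip(preds, target))
    return 10 * math.log10(signal_power / noise_power)


class RunningSISDR:
    """Hypothetical stand-in showing the update/compute pattern."""

    def __init__(self):
        self.total = 0.0  # sum of per-signal SI-SDR values
        self.count = 0    # number of signals seen so far

    def update(self, preds_batch, target_batch):
        # Accumulate one score per signal in the batch.
        for preds, target in zip(preds_batch, target_batch):
            self.total += si_sdr(preds, target)
            self.count += 1

    def compute(self):
        # Average over everything accumulated across update() calls.
        return self.total / self.count


metric = RunningSISDR()
metric.update([[2.5, 0.0, 2.0, 8.0]], [[3.0, -0.5, 2.0, 7.0]])
metric.update(
    [[1.0, 2.0, 2.5], [0.5, -1.0, 1.0]],
    [[1.0, 2.0, 3.0], [1.0, -1.0, 1.0]],
)
average = metric.compute()
print(metric.count, average)
```

Under this assumption, batches of different sizes contribute in proportion to the number of signals they contain, rather than each batch counting equally.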

Functional Interface

torchmetrics.functional.scale_invariant_signal_distortion_ratio(preds, target, zero_mean=False)

Calculates the scale-invariant signal-to-distortion ratio (SI-SDR) metric. The SI-SDR value is generally considered an overall measure of how good a source sounds.

Parameters

preds (Tensor) – shape [...,time]
target (Tensor) – shape [...,time]
zero_mean (bool) – whether to subtract the mean from target and preds before computing the metric

Return type

Tensor

Returns

SI-SDR value of shape [...]

Example

>>> import torch
>>> from torchmetrics.functional.audio import scale_invariant_signal_distortion_ratio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> scale_invariant_signal_distortion_ratio(preds, target)
tensor(18.4030)

References

[1] Le Roux, Jonathan, et al. “SDR - half-baked or well done?” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2019.