Scale-Invariant Signal-to-Noise Ratio (SI-SNR)¶

Module Interface¶

class torchmetrics.ScaleInvariantSignalNoiseRatio(compute_on_step=None, **kwargs)[source]

Scale-invariant signal-to-noise ratio (SI-SNR).

Forward accepts

preds: shape [...,time]
target: shape [...,time]

Parameters

compute_on_step¶ (Optional[bool]) –
Forward only calls update() and returns None if this is set to False.

Deprecated since version v0.8: Argument has no use anymore and will be removed v0.9.
kwargs¶ (Dict[str, Any]) – Additional keyword arguments, see Advanced metric settings for more info.

Raises

TypeError – if target and preds have a different shape

Returns

average si-snr value

Example

>>> import torch
>>> from torchmetrics import ScaleInvariantSignalNoiseRatio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> si_snr = ScaleInvariantSignalNoiseRatio()
>>> si_snr(preds, target)
tensor(15.0918)

References

[1] Y. Luo and N. Mesgarani, “TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation,” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 696-700, doi: 10.1109/ICASSP.2018.8462116.

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute()[source]

Computes average SI-SNR.

Return type: Tensor

update(preds, target)[source]

Update state with predictions and targets.

Parameters

preds¶ (Tensor) – Predictions from model
target¶ (Tensor) – Ground truth values

Return type

None

Functional Interface¶

torchmetrics.functional.scale_invariant_signal_noise_ratio(preds, target)[source]

Scale-invariant signal-to-noise ratio (SI-SNR).

Parameters

preds¶ (Tensor) – shape [...,time]
target¶ (Tensor) – shape [...,time]

Return type

Tensor

Returns

si-snr value of shape […]

Example

>>> import torch
>>> from torchmetrics.functional.audio import scale_invariant_signal_noise_ratio
>>> target = torch.tensor([3.0, -0.5, 2.0, 7.0])
>>> preds = torch.tensor([2.5, 0.0, 2.0, 8.0])
>>> scale_invariant_signal_noise_ratio(preds, target)
tensor(15.0918)

References

[1] Y. Luo and N. Mesgarani, “TaSNet: Time-Domain Audio Separation Network for Real-Time, Single-Channel Speech Separation,” 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2018, pp. 696-700, doi: 10.1109/ICASSP.2018.8462116.