Perceptual Evaluation of Speech Quality (PESQ)

Module Interface

class torchmetrics.audio.pesq.PerceptualEvaluationSpeechQuality(fs, mode, n_processes=1, **kwargs)

Perceptual Evaluation of Speech Quality (PESQ)

This is a wrapper for the pesq package [1]. Note that the input is moved to the CPU to perform the metric calculation.

Note

Using this metric requires pesq to be installed. Install it with either pip install torchmetrics[audio] or pip install pesq. Note that pesq is compiled against your currently installed version of numpy, so if you upgrade numpy later you will most likely have to reinstall pesq.

Forward accepts

preds: shape [...,time]
target: shape [...,time]

Parameters

fs (int) – sampling frequency, should be 16000 or 8000 (Hz)
mode (str) – 'wb' (wide-band) or 'nb' (narrow-band)
keep_same_device (bool) – whether to move the pesq value to the device of preds
n_processes (int) – number of processes to run in parallel for the metric calculation. Only applies to batches of data and only if the multiprocessing package is installed.
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Raises

ModuleNotFoundError – If the pesq package is not installed
ValueError – If fs is neither 8000 nor 16000
ValueError – If mode is neither "wb" nor "nb"
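
The two ValueError conditions listed above can be sketched in plain Python. The helper names check_pesq_args and raises_value_error are hypothetical and are not part of torchmetrics; the sketch only mirrors the documented checks on fs and mode and does not cover the pesq import check.

```python
def check_pesq_args(fs: int, mode: str) -> None:
    """Hypothetical helper mirroring the documented ValueError conditions."""
    # The sampling frequency must be one of the two supported rates (in Hz).
    if fs not in (8000, 16000):
        raise ValueError(f"Expected `fs` to be 8000 or 16000, but got {fs}")
    # The mode must be wide-band ('wb') or narrow-band ('nb').
    if mode not in ("wb", "nb"):
        raise ValueError(f"Expected `mode` to be 'wb' or 'nb', but got {mode!r}")


def raises_value_error(fs: int, mode: str) -> bool:
    """Return True if the (hypothetical) check rejects the arguments."""
    try:
        check_pesq_args(fs, mode)
    except ValueError:
        return True
    return False


print(raises_value_error(44100, "nb"))  # unsupported sampling frequency
print(raises_value_error(8000, "nb"))   # supported combination
```

Validating both arguments before any audio is processed means a misconfigured metric fails immediately rather than partway through an evaluation run.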

Example

>>> from torchmetrics.audio.pesq import PerceptualEvaluationSpeechQuality
>>> import torch
>>> g = torch.manual_seed(1)
>>> preds = torch.randn(8000)
>>> target = torch.randn(8000)
>>> nb_pesq = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> nb_pesq(preds, target)
tensor(2.2076)
>>> wb_pesq = PerceptualEvaluationSpeechQuality(16000, 'wb')
>>> wb_pesq(preds, target)
tensor(1.7359)

References

[1] https://github.com/ludlows/python-pesq

Initializes internal Module state, shared by both nn.Module and ScriptModule.

compute()

Computes average PESQ.

Return type: Tensor
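
As a rough illustration of how repeated update() calls feed an averaged compute() result, the sketch below accumulates per-signal scores and averages them on demand. It assumes the metric keeps a running sum and a count, which is an assumption about the internals rather than something stated on this page. The class name RunningPesqAverage and its attributes are hypothetical, and the scores are made-up numbers standing in for real PESQ values.

```python
from typing import Iterable


class RunningPesqAverage:
    """Hypothetical stand-in for the update()/compute() pattern."""

    def __init__(self) -> None:
        self.score_sum = 0.0  # sum of all scores seen so far
        self.count = 0        # number of scores seen so far

    def update(self, scores: Iterable[float]) -> None:
        # Fold one batch of per-signal scores into the running state.
        for score in scores:
            self.score_sum += float(score)
            self.count += 1

    def compute(self) -> float:
        # Average over every score accumulated across all update() calls.
        if self.count == 0:
            raise ValueError("compute() called before any update()")
        return self.score_sum / self.count


avg = RunningPesqAverage()
avg.update([2.0, 3.0])  # first batch: two signals
avg.update([4.0])       # second batch: one signal
result = avg.compute()  # mean over all three scores
print(result)
```

Raising an error on an empty state is a choice made for this sketch only; it is not a claim about how the library behaves in that case.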

update(preds, target)

Update state with predictions and targets.

Parameters

preds (Tensor) – Predictions from model
target (Tensor) – Ground truth values

Return type

None

Functional Interface

torchmetrics.functional.audio.pesq.perceptual_evaluation_speech_quality(preds, target, fs, mode, keep_same_device=False, n_processes=1)

PESQ (Perceptual Evaluation of Speech Quality)

This is a wrapper for the pesq package [1]. Note that the input is moved to the CPU to perform the metric calculation.

Note

Using this metric requires pesq to be installed. Install it with either pip install torchmetrics[audio] or pip install pesq. Note that pesq is compiled against your currently installed version of numpy, so if you upgrade numpy later you will most likely have to reinstall pesq.

Parameters

preds (Tensor) – shape [...,time]
target (Tensor) – shape [...,time]
fs (int) – sampling frequency, should be 16000 or 8000 (Hz)
mode (str) – 'wb' (wide-band) or 'nb' (narrow-band)
keep_same_device (bool) – whether to move the pesq value to the device of preds
n_processes (int) – number of processes to run in parallel for the metric calculation. Only applies to batches of data and only if the multiprocessing package is installed.

Return type

Tensor

Returns

pesq value of shape [...]

Raises

ModuleNotFoundError – If the pesq package is not installed
ValueError – If fs is neither 8000 nor 16000
ValueError – If mode is neither "wb" nor "nb"

Example

>>> from torchmetrics.functional.audio.pesq import perceptual_evaluation_speech_quality
>>> import torch
>>> g = torch.manual_seed(1)
>>> preds = torch.randn(8000)
>>> target = torch.randn(8000)
>>> perceptual_evaluation_speech_quality(preds, target, 8000, 'nb')
tensor(2.2076)
>>> perceptual_evaluation_speech_quality(preds, target, 16000, 'wb')
tensor(1.7359)

References

[1] https://github.com/ludlows/python-pesq