Perceptual Evaluation of Speech Quality (PESQ)

Module Interface

class torchmetrics.audio.pesq.PerceptualEvaluationSpeechQuality(fs, mode, n_processes=1, **kwargs)[source]

Calculate Perceptual Evaluation of Speech Quality (PESQ).

It is a recognized industry standard for audio quality that takes into consideration characteristics such as audio sharpness, call volume, background noise, clipping, and audio interference. PESQ returns a score between -0.5 and 4.5, with higher scores indicating better quality.

This metric is a wrapper for the pesq package. Note that the input is moved to the CPU to perform the metric calculation.

As input to forward and update, the metric accepts the following:

  • preds (Tensor): float tensor with shape (...,time)

  • target (Tensor): float tensor with shape (...,time)

As output of forward and compute, the metric returns the following:

  • pesq (Tensor): float tensor with shape (...,) of PESQ value per sample
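
The shape convention above can be illustrated without running PESQ itself. The following is a minimal sketch, assuming that the last axis is time and every leading axis is a batch axis, as the input and output descriptions state. The helper `pesq_output_shape` is hypothetical and not part of torchmetrics; it only reports how many individual signals would be scored and what shape the returned scores would have.

```python
from math import prod
from typing import Tuple


def pesq_output_shape(input_shape: Tuple[int, ...]) -> Tuple[int, Tuple[int, ...]]:
    """Hypothetical helper, not a torchmetrics function.

    Treats the last axis as time and all leading axes as batch axes,
    mirroring the (..., time) -> (...,) convention described above.
    """
    if len(input_shape) < 1:
        raise ValueError("expected at least a time dimension")
    batch_shape = tuple(input_shape[:-1])  # drop the time axis
    num_signals = prod(batch_shape)  # prod(()) == 1, i.e. a single signal
    return num_signals, batch_shape


single = pesq_output_shape((8000,))        # one 1-D signal
batched = pesq_output_shape((4, 2, 8000))  # 4 x 2 signals of 8000 samples
```

Under these assumptions a 1-D input yields a scalar score, while an input of shape (4, 2, 8000) yields one score for each of the 8 signals, arranged as (4, 2).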

Note

Using this metric requires pesq to be installed. Install it with either pip install torchmetrics[audio] or pip install pesq. pesq compiles against your currently installed version of numpy, so if you upgrade numpy at some point in the future you will most likely have to reinstall pesq.
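
A quick way to see whether the dependency is present is to ask the import system, as in the sketch below. The helper name `pesq_available` is an assumption made for illustration. It only checks that a module named pesq can be located; it does not detect a build that has gone stale after a numpy upgrade.

```python
import importlib.util


def pesq_available() -> bool:
    """Return True if a top-level module named ``pesq`` can be located.

    Illustrative helper only; this is not part of torchmetrics.
    """
    return importlib.util.find_spec("pesq") is not None


# Build a short status string without importing pesq itself.
status = "installed" if pesq_available() else "missing: try `pip install pesq`"
```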

Parameters
  • fs (int) – sampling frequency, should be 16000 or 8000 (Hz)

  • mode (str) – 'wb' (wide-band) or 'nb' (narrow-band)

  • keep_same_device (bool) – whether to move the pesq value to the device of preds

  • n_processes (int) – integer specifying the number of processes to run in parallel for the metric calculation. Only applies to batches of data and if the multiprocessing package is installed.

  • kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
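
The constraints on fs and mode listed above can be checked before constructing the metric. The following is a minimal sketch of such a pre-flight check; `check_pesq_args` is a hypothetical helper, and the exception type and messages are assumptions rather than the library's own validation.

```python
def check_pesq_args(fs: int, mode: str) -> None:
    """Hypothetical pre-flight check, not the library's own validation."""
    # The parameter list above allows only these two sampling frequencies.
    if fs not in (8000, 16000):
        raise ValueError(f"fs must be 8000 or 16000, got {fs}")
    # 'wb' is wide-band and 'nb' is narrow-band.
    if mode not in ("wb", "nb"):
        raise ValueError(f"mode must be 'wb' or 'nb', got {mode!r}")


check_pesq_args(16000, "wb")  # valid combination, passes silently

try:
    check_pesq_args(44100, "nb")  # unsupported sampling frequency
except ValueError as err:
    message = str(err)
```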

Raises

Example

>>> import torch
>>> from torchmetrics.audio.pesq import PerceptualEvaluationSpeechQuality
>>> g = torch.manual_seed(1)
>>> preds = torch.randn(8000)
>>> target = torch.randn(8000)
>>> nb_pesq = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> nb_pesq(preds, target)
tensor(2.2076)
>>> wb_pesq = PerceptualEvaluationSpeechQuality(16000, 'wb')
>>> wb_pesq(preds, target)
tensor(1.7359)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

plot(val=None, ax=None)[source]

Plot a single or multiple values from the metric.

Parameters
  • val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute or a list of these results. If no value is provided, will automatically call metric.compute and plot that result.

  • ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot is added to that axis.

Return type

Tuple[Figure, Union[Axes, ndarray]]

Returns

Figure and Axes object

Raises

ModuleNotFoundError – If matplotlib is not installed

>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.audio.pesq import PerceptualEvaluationSpeechQuality
>>> metric = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> metric.update(torch.rand(8000), torch.rand(8000))
>>> fig_, ax_ = metric.plot()

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.audio.pesq import PerceptualEvaluationSpeechQuality
>>> metric = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> values = []
>>> for _ in range(10):
...     values.append(metric(torch.rand(8000), torch.rand(8000)))
>>> fig_, ax_ = metric.plot(values)


Functional Interface

torchmetrics.functional.audio.pesq.perceptual_evaluation_speech_quality(preds, target, fs, mode, keep_same_device=False, n_processes=1)[source]

Calculate Perceptual Evaluation of Speech Quality (PESQ).

It is a recognized industry standard for audio quality that takes into consideration characteristics such as audio sharpness, call volume, background noise, clipping, and audio interference. PESQ returns a score between -0.5 and 4.5, with higher scores indicating better quality.

This metric is a wrapper for the pesq package. Note that the input is moved to the CPU to perform the metric calculation.

Note

Using this metric requires pesq to be installed. Install it with either pip install torchmetrics[audio] or pip install pesq. Note that pesq compiles against your currently installed version of numpy, so if you upgrade numpy at some point in the future you will most likely have to reinstall pesq.

Parameters
  • preds (Tensor) – float tensor with shape (...,time)

  • target (Tensor) – float tensor with shape (...,time)

  • fs (int) – sampling frequency, should be 16000 or 8000 (Hz)

  • mode (str) – 'wb' (wide-band) or 'nb' (narrow-band)

  • keep_same_device (bool) – whether to move the pesq value to the device of preds

  • n_processes (int) – integer specifying the number of processes to run in parallel for the metric calculation. Only applies to batches of data and if the multiprocessing package is installed.

Return type

Tensor

Returns

Float tensor with shape (...,) of PESQ values per sample

Raises

Example

>>> import torch
>>> from torch import randn
>>> from torchmetrics.functional.audio.pesq import perceptual_evaluation_speech_quality
>>> g = torch.manual_seed(1)
>>> preds = randn(8000)
>>> target = randn(8000)
>>> perceptual_evaluation_speech_quality(preds, target, 8000, 'nb')
tensor(2.2076)
>>> perceptual_evaluation_speech_quality(preds, target, 16000, 'wb')
tensor(1.7359)