Extended Edit Distance

Module Interface

class torchmetrics.text.ExtendedEditDistance(language='en', return_sentence_level_score=False, alpha=2.0, rho=0.3, deletion=0.2, insertion=1.0, **kwargs)

Compute the extended edit distance score (ExtendedEditDistance) [1] for a string or a list of strings.

The metric utilises the Levenshtein distance and extends it by adding a jump operation.
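For orientation, the classical character-level Levenshtein distance that the metric builds on can be sketched as below. This is a minimal illustration with unit costs, not the library's implementation: the function name `levenshtein` is hypothetical, and the jump operation, coverage cost and custom penalties of EED are deliberately left out.

```python
def levenshtein(hyp: str, ref: str) -> int:
    """Character-level Levenshtein distance with unit costs (illustrative only)."""
    # previous[j] holds the distance between the hyp prefix processed so far
    # and ref[:j]; for the empty hyp prefix that is simply j insertions.
    previous = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        current = [i]  # turning hyp[:i] into the empty string takes i deletions
        for j, r in enumerate(ref, start=1):
            current.append(
                min(
                    previous[j] + 1,             # delete h from the hypothesis
                    current[j - 1] + 1,          # insert r into the hypothesis
                    previous[j - 1] + (h != r),  # substitute, free on a match
                )
            )
        previous = current
    return previous[-1]


print(levenshtein("kitten", "sitting"))  # 3
```

EED replaces these unit costs with the deletion and insertion penalties listed under Parameters and adds the jump and coverage terms on top.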

As input to forward and update, the metric accepts the following:

  • preds (Sequence): The hypothesis corpus, as an iterable of sentences

  • target (Sequence): The reference corpus, as an iterable of iterables of reference sentences

As output of forward and compute, the metric returns the following:

  • eed (Tensor): A tensor with the extended edit distance score

Parameters:
  • language (Literal['en', 'ja']) – Language used in sentences. Only supports English (en) and Japanese (ja) for now.

  • return_sentence_level_score (bool) – Whether to also return the sentence-level EED scores

  • alpha (float) – optimal jump penalty, penalty for jumps between characters

  • rho (float) – coverage cost, penalty for repetition of characters

  • deletion (float) – penalty for deletion of character

  • insertion (float) – penalty for insertion or substitution of character

  • kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example

>>> from torchmetrics.text import ExtendedEditDistance
>>> preds = ["this is the prediction", "here is an other sample"]
>>> target = ["this is the reference", "here is another one"]
>>> eed = ExtendedEditDistance()
>>> eed(preds=preds, target=target)
tensor(0.3078)

plot(val=None, ax=None)

Plot a single or multiple values from the metric.

Parameters:
  • val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute or a list of these results. If no value is provided, will automatically call metric.compute and plot that result.

  • ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot is added to that axis

Return type:

Tuple[Figure, Union[Axes, ndarray]]

Returns:

Figure and Axes object

Raises:

ModuleNotFoundError – If matplotlib is not installed

>>> # Example plotting a single value
>>> from torchmetrics.text import ExtendedEditDistance
>>> metric = ExtendedEditDistance()
>>> preds = ["this is the prediction", "there is an other sample"]
>>> target = ["this is the reference", "there is another one"]
>>> metric.update(preds, target)
>>> fig_, ax_ = metric.plot()
>>> # Example plotting multiple values
>>> from torchmetrics.text import ExtendedEditDistance
>>> metric = ExtendedEditDistance()
>>> preds = ["this is the prediction", "there is an other sample"]
>>> target = ["this is the reference", "there is another one"]
>>> values = []
>>> for _ in range(10):
...     values.append(metric(preds, target))
>>> fig_, ax_ = metric.plot(values)

Functional Interface

torchmetrics.functional.text.extended_edit_distance(preds, target, language='en', return_sentence_level_score=False, alpha=2.0, rho=0.3, deletion=0.2, insertion=1.0)

Compute extended edit distance score (ExtendedEditDistance) [1] for strings or list of strings.

The metric utilises the Levenshtein distance and extends it by adding a jump operation.

Parameters:
  • preds (Union[str, Sequence[str]]) – The hypothesis corpus, as a single sentence or an iterable of sentences.

  • target (Sequence[Union[str, Sequence[str]]]) – The reference corpus, as an iterable of reference sentences or of iterables of reference sentences.

  • language (Literal['en', 'ja']) – Language used in sentences. Only supports English (en) and Japanese (ja) for now. Defaults to en.

  • return_sentence_level_score (bool) – Whether to also return the sentence-level EED scores.

  • alpha (float) – optimal jump penalty, penalty for jumps between characters

  • rho (float) – coverage cost, penalty for repetition of characters

  • deletion (float) – penalty for deletion of character

  • insertion (float) – penalty for insertion or substitution of character

Return type:

Union[Tensor, Tuple[Tensor, Tensor]]

Returns:

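Extended edit distance score as a tensor. If return_sentence_level_score is True, a tuple of the corpus-level score and the sentence-level scores.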

Example

>>> from torchmetrics.functional.text import extended_edit_distance
>>> preds = ["this is the prediction", "here is an other sample"]
>>> target = ["this is the reference", "here is another one"]
>>> extended_edit_distance(preds=preds, target=target)
tensor(0.3078)

References

[1] P. Stanchev, W. Wang, and H. Ney, “EED: Extended Edit Distance Measure for Machine Translation”, submitted to WMT 2019.