Extended Edit Distance

Module Interface

class torchmetrics.ExtendedEditDistance(language='en', return_sentence_level_score=False, alpha=2.0, rho=0.3, deletion=0.2, insertion=1.0, **kwargs)

Computes the extended edit distance (EED) score for a string or a list of strings.

The metric utilises the Levenshtein distance and extends it by adding a jump operation.
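
To make the Levenshtein baseline concrete, here is a minimal sketch of a character-level edit distance with configurable costs. It is not the library's implementation: it leaves out the jump operation and the coverage cost entirely, and the function name `weighted_edit_distance` and its `substitution` argument are hypothetical, introduced only for illustration.

```python
def weighted_edit_distance(
    source: str,
    target: str,
    deletion: float = 1.0,
    insertion: float = 1.0,
    substitution: float = 1.0,
) -> float:
    """Cost of turning `source` into `target` one character at a time.

    Illustrative only: no jump operation and no coverage cost.
    """
    # prev[j] is the cost of turning the processed prefix of `source` into target[:j].
    prev = [j * insertion for j in range(len(target) + 1)]
    for i, s_char in enumerate(source, start=1):
        curr = [i * deletion]
        for j, t_char in enumerate(target, start=1):
            match_cost = 0.0 if s_char == t_char else substitution
            curr.append(
                min(
                    prev[j] + deletion,        # drop s_char from the source
                    curr[j - 1] + insertion,   # add t_char to the source
                    prev[j - 1] + match_cost,  # keep or replace s_char
                )
            )
        prev = curr
    return prev[-1]


# Unit costs give the classic Levenshtein distance.
print(weighted_edit_distance("kitten", "sitting"))  # 3.0
```

With unit costs this reduces to the classic Levenshtein distance. Setting `deletion` below `insertion` makes a dropped character cheaper than an added one, which is the kind of asymmetry the metric's default penalties express.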

The forward and update methods accept the following inputs:

preds (Sequence): The hypothesis corpus, an iterable of strings
target (Sequence): The reference corpus, an iterable of iterables of strings

The forward and compute methods return the following output:

eed (Tensor): A tensor with the extended edit distance score

Parameters

language (Literal['en', 'ja']) – Language used in sentences. Only English (en) and Japanese (ja) are supported for now.
return_sentence_level_score (bool) – Whether to also return sentence-level EED scores.
alpha (float) – Jump penalty, the cost of a jump between characters.
rho (float) – Coverage cost, the penalty for repetition of characters.
deletion (float) – Penalty for deletion of a character.
insertion (float) – Penalty for insertion or substitution of a character.
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example

>>> from torchmetrics import ExtendedEditDistance
>>> preds = ["this is the prediction", "here is an other sample"]
>>> target = ["this is the reference", "here is another one"]
>>> eed = ExtendedEditDistance()
>>> eed(preds=preds, target=target)
tensor(0.3078)

Functional Interface

torchmetrics.functional.extended_edit_distance(preds, target, language='en', return_sentence_level_score=False, alpha=2.0, rho=0.3, deletion=0.2, insertion=1.0)

Computes the extended edit distance (EED) score [1] for a string or a list of strings. The metric utilises the Levenshtein distance and extends it by adding a jump operation.

Parameters

preds (Union[str, Sequence[str]]) – The hypothesis corpus, a string or an iterable of strings.
target (Sequence[Union[str, Sequence[str]]]) – The reference corpus, an iterable of strings or of iterables of strings.
language (Literal['en', 'ja']) – Language used in sentences. Only English (en) and Japanese (ja) are supported for now. Defaults to en.
return_sentence_level_score (bool) – Whether to also return sentence-level EED scores.
alpha (float) – Jump penalty, the cost of a jump between characters.
rho (float) – Coverage cost, the penalty for repetition of characters.
deletion (float) – Penalty for deletion of a character.
insertion (float) – Penalty for insertion or substitution of a character.

Return type

Union[Tensor, Tuple[Tensor, Tensor]]

Returns

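The extended edit distance score as a tensor. If return_sentence_level_score is True, a tuple of the corpus-level score and a tensor of sentence-level scores.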

Example

>>> from torchmetrics.functional import extended_edit_distance
>>> preds = ["this is the prediction", "here is an other sample"]
>>> target = ["this is the reference", "here is another one"]
>>> extended_edit_distance(preds=preds, target=target)
tensor(0.3078)

References

[1] P. Stanchev, W. Wang, and H. Ney, “EED: Extended Edit Distance Measure for Machine Translation”, submitted to WMT 2019.