Shortcuts

Translation Edit Rate (TER)

Module Interface

class torchmetrics.TranslationEditRate(normalize=False, no_punctuation=False, lowercase=True, asian_support=False, return_sentence_level_score=False, **kwargs)[source]

Calculate Translation edit rate (TER) of machine translated text with one or more references.

This implementation follows the one from SacreBleu_ter, which is a near-exact reimplementation of the Tercom algorithm, produces identical results on all “sane” outputs.

As input to forward and update the metric accepts the following input:

  • preds (Sequence): An iterable of hypothesis corpus

  • target (Sequence): An iterable of iterables of reference corpus

As output of forward and compute the metric returns the following output:

  • ter (Tensor): if return_sentence_level_score=True return a corpus-level translation edit rate with a list of sentence-level translation_edit_rate, else return a corpus-level translation edit rate

Parameters
  • normalize (bool) – An indication whether a general tokenization to be applied.

  • no_punctuation (bool) – An indication whteher a punctuation to be removed from the sentences.

  • lowercase (bool) – An indication whether to enable case-insesitivity.

  • asian_support (bool) – An indication whether asian characters to be processed.

  • return_sentence_level_score (bool) – An indication whether a sentence-level TER to be returned.

  • kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Example

>>> preds = ['the cat is on the mat']
>>> target = [['there is a cat on the mat', 'a cat is on the mat']]
>>> ter = TranslationEditRate()
>>> ter(preds, target)
tensor(0.1538)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Functional Interface

torchmetrics.functional.translation_edit_rate(preds, target, normalize=False, no_punctuation=False, lowercase=True, asian_support=False, return_sentence_level_score=False)[source]

Calculate Translation edit rate (TER) of machine translated text with one or more references. This implementation follows the implmenetaions from https://github.com/mjpost/sacrebleu/blob/master/sacrebleu/metrics/ter.py. The sacrebleu implmenetation is a near-exact reimplementation of the Tercom algorithm, produces identical results on all “sane” outputs.

Parameters
  • preds (Union[str, Sequence[str]]) – An iterable of hypothesis corpus.

  • target (Sequence[Union[str, Sequence[str]]]) – An iterable of iterables of reference corpus.

  • normalize (bool) – An indication whether a general tokenization to be applied.

  • no_punctuation (bool) – An indication whteher a punctuation to be removed from the sentences.

  • lowercase (bool) – An indication whether to enable case-insesitivity.

  • asian_support (bool) – An indication whether asian characters to be processed.

  • return_sentence_level_score (bool) – An indication whether a sentence-level TER to be returned.

Return type

Union[Tensor, Tuple[Tensor, List[Tensor]]]

Returns

A corpus-level translation edit rate (TER). (Optionally) A list of sentence-level translation_edit_rate (TER) if return_sentence_level_score=True.

Example

>>> preds = ['the cat is on the mat']
>>> target = [['there is a cat on the mat', 'a cat is on the mat']]
>>> translation_edit_rate(preds, target)
tensor(0.1538)

References

[1] A Study of Translation Edit Rate with Targeted Human Annotation by Mathew Snover, Bonnie Dorr, Richard Schwartz, Linnea Micciulla and John Makhoul TER

Read the Docs v: v0.11.2
Versions
latest
stable
v0.11.2
v0.11.1
v0.11.0
v0.10.3
v0.10.2
v0.10.1
v0.10.0
v0.9.3
v0.9.2
v0.9.1
v0.9.0
v0.8.2
v0.8.1
v0.8.0
v0.7.3
v0.7.2
v0.7.1
v0.7.0
v0.6.2
v0.6.1
v0.6.0
v0.5.1
v0.5.0
v0.4.1
v0.4.0
v0.3.2
v0.3.1
v0.3.0
v0.2.0
v0.1.0
Downloads
On Read the Docs
Project Home
Builds

Free document hosting provided by Read the Docs.