ROUGE Score¶
Module Interface¶
- class torchmetrics.text.rouge.ROUGEScore(use_stemmer=False, normalizer=None, tokenizer=None, accumulate='best', rouge_keys=('rouge1', 'rouge2', 'rougeL', 'rougeLsum'), compute_on_step=None, **kwargs)[source]
Calculate Rouge Score, used for automatic summarization.
This implementation should imitate the behaviour of the rouge-score package Python ROUGE Implementation
- Parameters
use_stemmer¶ (
bool) – Use Porter stemmer to strip word suffixes to improve matching.normalizer¶ (
Optional[Callable[[str],str]]) – A user’s own normalizer function. If this isNone, replacing any non-alpha-numeric characters with spaces is default. This function must take astrand return astr.tokenizer¶ (
Optional[Callable[[str],Sequence[str]]]) – A user’s own tokenizer function. If this isNone, spliting by spaces is default This function must take a str and returnSequence[str]accumulate¶ (
Literal[‘avg’, ‘best’]) –Useful in case of multi-reference rouge score.
avgtakes the avg of all references with respect to predictionsbesttakes the best fmeasure score obtained between prediction and multiple corresponding references.
rouge_keys¶ (
Union[str,Tuple[str,...]]) – A list of rouge types to calculate. Keys that are allowed arerougeL,rougeLsum, androuge1throughrouge9.compute_on_step¶ (
Optional[bool]) –Forward only calls
update()and returns None if this is set to False.Deprecated since version v0.8: Argument has no use anymore and will be removed v0.9.
kwargs¶ (
Dict[str,Any]) – Additional keyword arguments, see Advanced metric settings for more info.
Example
>>> from torchmetrics.text.rouge import ROUGEScore >>> preds = "My name is John" >>> target = "Is your name John" >>> rouge = ROUGEScore() >>> from pprint import pprint >>> pprint(rouge(preds, target)) {'rouge1_fmeasure': tensor(0.7500), 'rouge1_precision': tensor(0.7500), 'rouge1_recall': tensor(0.7500), 'rouge2_fmeasure': tensor(0.), 'rouge2_precision': tensor(0.), 'rouge2_recall': tensor(0.), 'rougeL_fmeasure': tensor(0.5000), 'rougeL_precision': tensor(0.5000), 'rougeL_recall': tensor(0.5000), 'rougeLsum_fmeasure': tensor(0.5000), 'rougeLsum_precision': tensor(0.5000), 'rougeLsum_recall': tensor(0.5000)}
- Raises
ValueError – If the python packages
nltkis not installed.ValueError – If any of the
rouge_keysdoes not belong to the allowed set of keys.
References
[1] ROUGE: A Package for Automatic Evaluation of Summaries by Chin-Yew Lin Rouge Detail
Initializes internal Module state, shared by both nn.Module and ScriptModule.
- compute()[source]
Calculate (Aggregate and provide confidence intervals) ROUGE score.
- update(preds, target)[source]
Compute rouge scores.
Functional Interface¶
- torchmetrics.functional.text.rouge.rouge_score(preds, target, accumulate='best', use_stemmer=False, normalizer=None, tokenizer=None, rouge_keys=('rouge1', 'rouge2', 'rougeL', 'rougeLsum'))[source]
Calculate Calculate Rouge Score , used for automatic summarization.
- Parameters
preds¶ (
Union[str,Sequence[str]]) – An iterable of predicted sentences or a single predicted sentence.target¶ (
Union[str,Sequence[str],Sequence[Sequence[str]]]) – An iterable of iterables of target sentences or an iterable of target sentences or a single target sentence.accumulate¶ (
Literal[‘avg’, ‘best’]) –Useful incase of multi-reference rouge score.
avgtakes the avg of all references with respect to predictionsbesttakes the best fmeasure score obtained between prediction and multiple corresponding references.
use_stemmer¶ (
bool) – Use Porter stemmer to strip word suffixes to improve matching.normalizer¶ (
Optional[Callable[[str],str]]) – A user’s own normalizer function. If this isNone, replacing any non-alpha-numeric characters with spaces is default. This function must take astrand return astr.tokenizer¶ (
Optional[Callable[[str],Sequence[str]]]) – A user’s own tokenizer function. If this isNone, spliting by spaces is default This function must take astrand returnSequence[str]rouge_keys¶ (
Union[str,Tuple[str,...]]) – A list of rouge types to calculate. Keys that are allowed arerougeL,rougeLsum, androuge1throughrouge9.
- Return type
- Returns
Python dictionary of rouge scores for each input rouge key.
Example
>>> from torchmetrics.functional.text.rouge import rouge_score >>> preds = "My name is John" >>> target = "Is your name John" >>> from pprint import pprint >>> pprint(rouge_score(preds, target)) {'rouge1_fmeasure': tensor(0.7500), 'rouge1_precision': tensor(0.7500), 'rouge1_recall': tensor(0.7500), 'rouge2_fmeasure': tensor(0.), 'rouge2_precision': tensor(0.), 'rouge2_recall': tensor(0.), 'rougeL_fmeasure': tensor(0.5000), 'rougeL_precision': tensor(0.5000), 'rougeL_recall': tensor(0.5000), 'rougeLsum_fmeasure': tensor(0.5000), 'rougeLsum_precision': tensor(0.5000), 'rougeLsum_recall': tensor(0.5000)}
- Raises
ModuleNotFoundError – If the python package
nltkis not installed.ValueError – If any of the
rouge_keysdoes not belong to the allowed set of keys.
References
[1] ROUGE: A Package for Automatic Evaluation of Summaries by Chin-Yew Lin. https://aclanthology.org/W04-1013/