InfoLM¶

Module Interface¶

class torchmetrics.text.infolm.InfoLM(model_name_or_path='bert-base-uncased', temperature=0.25, information_measure='kl_divergence', idf=True, alpha=None, beta=None, device=None, max_length=None, batch_size=64, num_threads=0, verbose=True, return_sentence_level_score=False, **kwargs)[source]

Calculate InfoLM - i.e. calculate a distance/divergence between predicted and reference sentence discrete distribution using one of the following information measures:

KL divergence

alpha divergence

beta divergence

AB divergence

Rényi divergence

L1 distance

L2 distance

L-infinity distance

Fisher-Rao distance

InfoLM is a family of untrained embedding-based metrics which addresses some famous flaws of standard string-based metrics thanks to the usage of pre-trained masked language models. This family of metrics is mainly designed for summarization and data-to-text tasks.

The implementation of this metric is fully based HuggingFace transformers’ package.

As input to forward and update the metric accepts the following input:

preds (Sequence): An iterable of hypothesis corpus
target (Sequence): An iterable of reference corpus

As output of forward and compute the metric returns the following output:

infolm (Tensor): If return_sentence_level_score=True return a tuple with a tensor with the corpus-level InfoLM score and a list of sentence-level InfoLM scores, else return a corpus-level InfoLM score

Parameters

model_name_or_path¶ (Union[str, PathLike]) – A name or a model path used to load transformers pretrained model. By default the “bert-base-uncased” model is used.
temperature¶ (float) – A temperature for calibrating language modelling. For more information, please reference InfoLM paper.
information_measure¶ (Literal[‘kl_divergence’, ‘alpha_divergence’, ‘beta_divergence’, ‘ab_divergence’, ‘renyi_divergence’, ‘l1_distance’, ‘l2_distance’, ‘l_infinity_distance’, ‘fisher_rao_distance’]) – A name of information measure to be used. Please use one of: [‘kl_divergence’, ‘alpha_divergence’, ‘beta_divergence’, ‘ab_divergence’, ‘renyi_divergence’, ‘l1_distance’, ‘l2_distance’, ‘l_infinity_distance’, ‘fisher_rao_distance’]
idf¶ (bool) – An indication of whether normalization using inverse document frequencies should be used.
alpha¶ (Optional[float]) – Alpha parameter of the divergence used for alpha, AB and Rényi divergence measures.
beta¶ (Optional[float]) – Beta parameter of the divergence used for beta and AB divergence measures.
device¶ (Union[str, device, None]) – A device to be used for calculation.
max_length¶ (Optional[int]) – A maximum length of input sequences. Sequences longer than max_length are to be trimmed.
batch_size¶ (int) – A batch size used for model processing.
num_threads¶ (int) – A number of threads to use for a dataloader.
verbose¶ (bool) – An indication of whether a progress bar to be displayed during the embeddings calculation.
return_sentence_level_score¶ (bool) – An indication whether a sentence-level InfoLM score to be returned.

Example

>>> from torchmetrics.text.infolm import InfoLM
>>> preds = ['he read the book because he was interested in world history']
>>> target = ['he was interested in world history because he read the book']
>>> infolm = InfoLM('google/bert_uncased_L-2_H-128_A-2', idf=False)
>>> infolm(preds, target)
tensor(-0.1784)

Initializes internal Module state, shared by both nn.Module and ScriptModule.

Functional Interface¶

torchmetrics.functional.text.infolm.infolm(preds, target, model_name_or_path='bert-base-uncased', temperature=0.25, information_measure='kl_divergence', idf=True, alpha=None, beta=None, device=None, max_length=None, batch_size=64, num_threads=0, verbose=True, return_sentence_level_score=False)[source]¶

Calculate InfoLM [1] - i.e. calculate a distance/divergence between predicted and reference sentence discrete distribution using one of the following information measures:

KL divergence

alpha divergence

beta divergence

AB divergence

Rényi divergence

L1 distance

L2 distance

L-infinity distance

Fisher-Rao distance

InfoLM is a family of untrained embedding-based metrics which addresses some famous flaws of standard string-based metrics thanks to the usage of pre-trained masked language models. This family of metrics is mainly designed for summarization and data-to-text tasks.

If you want to use IDF scaling over the whole dataset, please use the class metric.

The implementation of this metric is fully based HuggingFace transformers’ package.

Parameters

preds¶ (Union[str, Sequence[str]]) – An iterable of hypothesis corpus.
target¶ (Union[str, Sequence[str]]) – An iterable of reference corpus.
model_name_or_path¶ (Union[str, PathLike]) – A name or a model path used to load transformers pretrained model.
temperature¶ (float) – A temperature for calibrating language modelling. For more information, please reference InfoLM paper.
information_measure¶ (Literal[‘kl_divergence’, ‘alpha_divergence’, ‘beta_divergence’, ‘ab_divergence’, ‘renyi_divergence’, ‘l1_distance’, ‘l2_distance’, ‘l_infinity_distance’, ‘fisher_rao_distance’]) – A name of information measure to be used. Please use one of: [‘kl_divergence’, ‘alpha_divergence’, ‘beta_divergence’, ‘ab_divergence’, ‘renyi_divergence’, ‘l1_distance’, ‘l2_distance’, ‘l_infinity_distance’, ‘fisher_rao_distance’]
idf¶ (bool) – An indication of whether normalization using inverse document frequencies should be used.
alpha¶ (Optional[float]) – Alpha parameter of the divergence used for alpha, AB and Rényi divergence measures.
beta¶ (Optional[float]) – Beta parameter of the divergence used for beta and AB divergence measures.
device¶ (Union[str, device, None]) – A device to be used for calculation.
max_length¶ (Optional[int]) – A maximum length of input sequences. Sequences longer than max_length are to be trimmed.
batch_size¶ (int) – A batch size used for model processing.
num_threads¶ (int) – A number of threads to use for a dataloader.
verbose¶ (bool) – An indication of whether a progress bar to be displayed during the embeddings calculation.
return_sentence_level_score¶ (bool) – An indication whether a sentence-level InfoLM score to be returned.

Return type

Union[Tensor, Tuple[Tensor, Tensor]]

Returns

A corpus-level InfoLM score. (Optionally) A list of sentence-level InfoLM scores if return_sentence_level_score=True.

Example

>>> from torchmetrics.functional.text.infolm import infolm
>>> preds = ['he read the book because he was interested in world history']
>>> target = ['he was interested in world history because he read the book']
>>> infolm(preds, target, model_name_or_path='google/bert_uncased_L-2_H-128_A-2', idf=False)
tensor(-0.1784)

References

[1] InfoLM: A New Metric to Evaluate Summarization & Data2Text Generation by Pierre Colombo, Chloé Clavel and Pablo Piantanida InfoLM