nltk.translate.meteor

nltk.translate.meteor(references: Iterable[Iterable[str]], hypothesis: Iterable[str], preprocess: Callable[[str], str] = str.lower, stemmer: StemmerI = PorterStemmer(), wordnet: WordNetCorpusReader = nltk.corpus.wordnet, alpha: float = 0.9, beta: float = 3.0, gamma: float = 0.5) → float

Calculates the METEOR score for a hypothesis against multiple references, as described in “Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments” by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL. https://www.cs.cmu.edu/~alavie/METEOR/pdf/Lavie-Agarwal-2007-METEOR.pdf

When multiple references are given, the best score is chosen: the method calls single_meteor_score once per reference and returns the highest score obtained for the given hypothesis.

>>> from nltk.translate.meteor_score import meteor_score
>>> hypothesis1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'which', 'ensures', 'that', 'the', 'military', 'always', 'obeys', 'the', 'commands', 'of', 'the', 'party']
>>> hypothesis2 = ['It', 'is', 'to', 'insure', 'the', 'troops', 'forever', 'hearing', 'the', 'activity', 'guidebook', 'that', 'party', 'direct']

>>> reference1 = ['It', 'is', 'a', 'guide', 'to', 'action', 'that', 'ensures', 'that', 'the', 'military', 'will', 'forever', 'heed', 'Party', 'commands']
>>> reference2 = ['It', 'is', 'the', 'guiding', 'principle', 'which', 'guarantees', 'the', 'military', 'forces', 'always', 'being', 'under', 'the', 'command', 'of', 'the', 'Party']
>>> reference3 = ['It', 'is', 'the', 'practical', 'guide', 'for', 'the', 'army', 'always', 'to', 'heed', 'the', 'directions', 'of', 'the', 'party']

>>> round(meteor_score([reference1, reference2, reference3], hypothesis1),4)
0.7398
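The best-of-references selection can be illustrated without NLTK installed. The following is a minimal sketch, not the library's implementation: `unigram_f1` is a hypothetical stand-in for single_meteor_score (it only measures lowercased unigram overlap), and `best_over_references` is an illustrative name for the step that keeps the maximum.

```python
from typing import Callable, Iterable, List


def unigram_f1(reference: List[str], hypothesis: List[str]) -> float:
    """Hypothetical stand-in scorer: F1 over lowercased unigram sets.

    This is NOT METEOR; it only gives the selection step something to rank.
    """
    ref = {token.lower() for token in reference}
    hyp = {token.lower() for token in hypothesis}
    overlap = len(ref & hyp)
    if overlap == 0:
        return 0.0
    precision = overlap / len(hyp)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)


def best_over_references(
    references: Iterable[List[str]],
    hypothesis: List[str],
    score_fn: Callable[[List[str], List[str]], float] = unigram_f1,
) -> float:
    """Score the hypothesis against each reference and keep the best score."""
    # max() raises ValueError on an empty iterable, so at least one
    # reference must be supplied.
    return max(score_fn(reference, hypothesis) for reference in references)


references = [["a", "cat"], ["the", "dog", "barks"]]
hypothesis = ["The", "dog", "barks"]
best = best_over_references(references, hypothesis)
```

Here the second reference matches the hypothesis exactly after lowercasing, so it determines the result; the first reference shares no tokens and contributes a score of zero.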

If no words match during alignment, the method returns a score of 0. Returning zero instead of raising a division-by-zero error is safe, because no match usually implies a bad translation.

>>> round(meteor_score([['this', 'is', 'a', 'cat']], ['non', 'matching', 'hypothesis']),4)
0.0

Parameters

references (Iterable[Iterable[str]]) – pre-tokenized reference sentences
hypothesis (Iterable[str]) – a pre-tokenized hypothesis sentence
preprocess (Callable[[str], str]) – preprocessing function (default str.lower)
stemmer (StemmerI) – nltk.stem.api.StemmerI object (default PorterStemmer())
wordnet (WordNetCorpusReader) – a wordnet corpus reader object (default nltk.corpus.wordnet)
alpha (float) – parameter for controlling relative weights of precision and recall.
beta (float) – parameter for controlling the shape of the penalty as a function of fragmentation.
gamma (float) – relative weight assigned to fragmentation penalty.
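To make the roles of alpha, beta, and gamma concrete, here is a minimal sketch of how the three parameters combine in the formula from the cited paper: a weighted harmonic mean of precision and recall, scaled down by a fragmentation penalty. The function name `meteor_from_counts` and its count-based inputs are assumptions for illustration; the sketch takes the alignment results as given and does not perform the stemming or WordNet matching that produces them.

```python
def meteor_from_counts(
    matches: int,
    hypothesis_len: int,
    reference_len: int,
    chunks: int,
    alpha: float = 0.9,
    beta: float = 3.0,
    gamma: float = 0.5,
) -> float:
    """Combine alignment counts into a METEOR-style score (illustrative only).

    matches: number of aligned unigrams
    chunks:  number of contiguous, in-order runs of aligned unigrams
    """
    if matches == 0:
        # No aligned words: return 0 rather than dividing by zero.
        return 0.0
    precision = matches / hypothesis_len
    recall = matches / reference_len
    # alpha weights precision against recall in the harmonic mean.
    fmean = precision * recall / (alpha * precision + (1 - alpha) * recall)
    # Fragmentation is 1/matches for one contiguous chunk and approaches 1
    # as the matches scatter; beta shapes the curve, gamma caps the penalty.
    fragmentation = chunks / matches
    penalty = gamma * fragmentation ** beta
    return (1 - penalty) * fmean


perfect = meteor_from_counts(matches=3, hypothesis_len=3, reference_len=3, chunks=1)
no_match = meteor_from_counts(matches=0, hypothesis_len=3, reference_len=4, chunks=0)
```

Note that even a perfect three-word match scores slightly below 1 under this formula, because a single chunk still yields a fragmentation of 1/3 and therefore a small penalty.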

Returns

The sentence-level METEOR score.

Return type

float
