Metrics

Distance metrics

aline

ALINE, a feature-based algorithm for aligning phonetic sequences (https://webdocs.cs.ualberta.ca/~kondrak/). Copyright 2002 by Grzegorz Kondrak.

binary_distance(label1, label2)

Simple equality test.

custom_distance(file)

edit_distance(s1, s2[, substitution_cost, ...])

Calculate the Levenshtein edit-distance between two strings.
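As an illustration of what this entry computes, here is a minimal pure-Python sketch of Levenshtein edit distance via dynamic programming; the library version additionally supports options such as transpositions, which are omitted here.

```python
def edit_distance(s1, s2, substitution_cost=1):
    """Levenshtein distance via dynamic programming (illustrative sketch)."""
    m, n = len(s1), len(s2)
    # prev[j] holds the distance between s1[:i-1] and s2[:j]
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            sub = 0 if s1[i - 1] == s2[j - 1] else substitution_cost
            curr[j] = min(prev[j] + 1,        # deletion
                          curr[j - 1] + 1,    # insertion
                          prev[j - 1] + sub)  # substitution / match
        prev = curr
    return prev[n]

print(edit_distance("rain", "shine"))  # 3
```

Turning "rain" into "shine" takes two substitutions (r→s, a→h) and one insertion (e), hence a distance of 3.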

edit_distance_align(s1, s2[, substitution_cost])

Calculate the minimum Levenshtein edit-distance based alignment mapping between two strings.

fractional_presence(label)

interval_distance(label1, label2)

Krippendorff's interval distance metric.

jaccard_distance(label1, label2)

Distance metric comparing set-similarity.
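The set-similarity comparison here is the Jaccard distance, 1 − |A ∩ B| / |A ∪ B|. A minimal sketch:

```python
def jaccard_distance(label1, label2):
    """1 - |A ∩ B| / |A ∪ B| for two sets of labels (illustrative sketch)."""
    union = label1 | label2
    if not union:          # two empty sets: treat as identical
        return 0.0
    return 1 - len(label1 & label2) / len(union)

print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # 0.5
```

The sets {1, 2, 3} and {2, 3, 4} share 2 of 4 distinct items, so their distance is 1 − 2/4 = 0.5.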

masi_distance(label1, label2)

Distance metric that takes into account partial agreement when multiple labels are assigned.

presence(label)

Higher-order function to test presence of a given label.

Scores

AnnotationTask

Represents an annotation task, i.e. people assigning labels to items.

ConfusionMatrix

The confusion matrix between a list of reference values and a corresponding list of test values.

Paice

Class for storing lemmas, stems and evaluation metrics.

accuracy(reference, test)

Given a list of reference values and a corresponding list of test values, return the fraction of corresponding values that are equal.
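A sketch of this computation, assuming the two lists are position-aligned:

```python
def accuracy(reference, test):
    """Fraction of positions where reference and test agree (sketch)."""
    if len(reference) != len(test):
        raise ValueError("lists must have the same length")
    return sum(r == t for r, t in zip(reference, test)) / len(reference)

print(accuracy(["A", "B", "B", "A"], ["A", "B", "A", "A"]))  # 0.75
```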

approxrand(a, b, **kwargs)

Returns an approximate significance level between two lists of independently generated test values.

f_measure(reference, test[, alpha])

Given a set of reference values and a set of test values, return the f-measure of the test values, when compared against the reference values.

log_likelihood(reference, test)

Given a list of reference values and a corresponding list of test probability distributions, return the average log likelihood of the reference values, given the probability distributions.

precision(reference, test)

Given a set of reference values and a set of test values, return the fraction of test values that appear in the reference set.

recall(reference, test)

Given a set of reference values and a set of test values, return the fraction of reference values that appear in the test set.
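The three set-based scores above fit together as follows; this is a hedged sketch of the standard definitions (f_measure as the weighted harmonic mean of precision and recall), not the library's exact code:

```python
def precision(reference, test):
    """Fraction of test values that appear in the reference set."""
    return len(reference & test) / len(test) if test else None

def recall(reference, test):
    """Fraction of reference values that appear in the test set."""
    return len(reference & test) / len(reference) if reference else None

def f_measure(reference, test, alpha=0.5):
    """Weighted harmonic mean of precision and recall."""
    p, r = precision(reference, test), recall(reference, test)
    if not p or not r:
        return 0.0
    return 1.0 / (alpha / p + (1 - alpha) / r)

ref, hyp = {"a", "b", "c", "d"}, {"b", "c", "e"}
print(precision(ref, hyp), recall(ref, hyp))  # 0.6666666666666666 0.5
```

With alpha = 0.5 the f-measure is the ordinary harmonic mean: here 1 / (0.5/(2/3) + 0.5/0.5) = 4/7 ≈ 0.571.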

Segmentation

ghd(ref, hyp[, ins_cost, del_cost, ...])

Compute the Generalized Hamming Distance between a reference and a hypothetical segmentation: the cost of transforming the hypothetical segmentation into the reference through boundary insertion, deletion, and shift operations.

pk(ref, hyp[, k, boundary])

Compute the Pk metric for a pair of segmentations. A segmentation is any sequence over a vocabulary of two items (e.g. "0", "1"), where the specified boundary value is used to mark the edge of a segmentation.

windowdiff(seg1, seg2, k[, boundary, weighted])

Compute the windowdiff score for a pair of segmentations.
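A sketch of the unweighted WindowDiff computation (Pevzner & Hearst): slide a window of width k over the two same-length segmentation strings and count the windows in which they disagree on the number of boundaries. Parameter names follow the entry above, but this is an illustrative implementation, not the library's exact code.

```python
def windowdiff(seg1, seg2, k, boundary="1"):
    """WindowDiff: fraction of width-k windows in which the two
    segmentations contain a different number of boundary marks."""
    if len(seg1) != len(seg2):
        raise ValueError("segmentations must have the same length")
    wd = 0
    for i in range(len(seg1) - k + 1):
        ndiff = abs(seg1[i:i + k].count(boundary) - seg2[i:i + k].count(boundary))
        wd += min(1, ndiff)  # unweighted: each mismatched window counts once
    return wd / (len(seg1) - k + 1)

print(windowdiff("0100", "0100", 2))  # 0.0
```

Comparing "0100" against "0000" with k = 2 gives 2/3: two of the three windows cover the spurious boundary.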

Spearman

ranks_from_scores(scores[, rank_gap])

Given a sequence of (key, score) tuples, yields each key with an increasing rank, tying with previous key's rank if the difference between their scores is less than rank_gap.

ranks_from_sequence(seq)

Given a sequence, yields each element with an increasing rank, suitable for use as an argument to spearman_correlation.

spearman_correlation(ranks1, ranks2)

Returns the Spearman correlation coefficient for two rankings, which should be dicts or sequences of (key, rank).
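A sketch using the closed form ρ = 1 − 6 Σd² / (n(n² − 1)), which is valid when there are no tied ranks; the library version works from the rank dicts produced by the helpers above and may handle edge cases differently.

```python
def spearman_correlation(ranks1, ranks2):
    """Spearman's rho over the keys shared by two rank assignments
    (sketch; assumes no tied ranks)."""
    ranks1, ranks2 = dict(ranks1), dict(ranks2)
    keys = ranks1.keys() & ranks2.keys()
    n = len(keys)
    d_squared = sum((ranks1[k] - ranks2[k]) ** 2 for k in keys)
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

r1 = {"a": 1, "b": 2, "c": 3, "d": 4}
r2 = {"a": 2, "b": 1, "c": 3, "d": 4}
print(spearman_correlation(r1, r2))  # 0.8
```

Swapping one adjacent pair of ranks gives Σd² = 2, hence ρ = 1 − 12/60 = 0.8; identical rankings give 1.0 and fully reversed ones give −1.0.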

Translation

bleu(references, hypothesis[, weights, ...])

Calculate BLEU score (Bilingual Evaluation Understudy) from Papineni, Kishore, Salim Roukos, Todd Ward, and Wei-Jing Zhu.
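A compact sketch of sentence-level BLEU: the geometric mean of clipped n-gram precisions times a brevity penalty. This illustrative version uses no smoothing, so any zero n-gram precision drives the score to 0; the library implementation offers smoothing and other options.

```python
from collections import Counter
from math import exp, log

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(references, hypothesis, weights=(0.25, 0.25, 0.25, 0.25)):
    """Sentence-level BLEU sketch: clipped n-gram precisions + brevity penalty."""
    precisions = []
    for n, w in enumerate(weights, start=1):
        hyp_counts = Counter(ngrams(hypothesis, n))
        # clip each hypothesis n-gram count by its max count in any reference
        max_ref = Counter()
        for ref in references:
            for ng, c in Counter(ngrams(ref, n)).items():
                max_ref[ng] = max(max_ref[ng], c)
        clipped = sum(min(c, max_ref[ng]) for ng, c in hyp_counts.items())
        precisions.append(clipped / max(1, sum(hyp_counts.values())))
    if min(precisions) == 0:
        return 0.0  # unsmoothed: a single empty precision zeroes the score
    # brevity penalty against the reference closest in length
    ref_len = min((abs(len(r) - len(hypothesis)), len(r)) for r in references)[1]
    bp = 1.0 if len(hypothesis) > ref_len else exp(1 - ref_len / len(hypothesis))
    return bp * exp(sum(w * log(p) for w, p in zip(weights, precisions)))

ref = "the cat is on the mat".split()
print(bleu([ref], ref))  # 1.0
```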

ribes(references, hypothesis[, alpha, beta])

The RIBES (Rank-based Intuitive Bilingual Evaluation Score) from Hideki Isozaki, Tsutomu Hirao, Kevin Duh, Katsuhito Sudoh and Hajime Tsukada.

meteor(references, hypothesis[, preprocess, ...])

Calculates METEOR score for a hypothesis with multiple references as described in "Meteor: An Automatic Metric for MT Evaluation with High Levels of Correlation with Human Judgments" by Alon Lavie and Abhaya Agarwal, in Proceedings of ACL.

alignment_error_rate(reference, hypothesis)

Return the Alignment Error Rate (AER) of an alignment with respect to a "gold standard" reference alignment.
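A sketch of the usual AER formula, 1 − (|A ∩ S| + |A ∩ P|) / (|A| + |S|), where S is the set of sure gold links, P ⊇ S the possible links, and A the hypothesis alignment, with each alignment represented as a set of (source, target) index pairs. The `possible` parameter is part of this sketch's assumed interface.

```python
def alignment_error_rate(reference, hypothesis, possible=None):
    """AER = 1 - (|A ∩ S| + |A ∩ P|) / (|A| + |S|) (illustrative sketch)."""
    if possible is None:
        possible = reference  # no possible links given: take P = S
    return 1 - (len(hypothesis & reference) + len(hypothesis & possible)) / (
        len(hypothesis) + len(reference))

ref = {(0, 0), (1, 1), (2, 2)}
hyp = {(0, 0), (1, 1), (2, 3)}
print(alignment_error_rate(ref, hyp))  # ≈ 0.333
```

Two of the three hypothesized links are correct, so AER = 1 − 4/6 ≈ 0.333; a perfect alignment scores 0.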

nist(references, hypothesis[, n])

Calculate NIST score from George Doddington.

chrf(reference, hypothesis[, min_len, ...])

Calculates the sentence-level chrF (Character n-gram F-score) described in Maja Popović (2015), "chrF: character n-gram F-score for automatic MT evaluation".

gleu(references, hypothesis[, min_len, max_len])

Calculates the sentence-level GLEU (Google-BLEU) score described in Wu et al. (2016), "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation".