nltk.metrics.QuadgramAssocMeasures
- class nltk.metrics.QuadgramAssocMeasures
Bases: NgramAssocMeasures
A collection of quadgram association measures. Each association measure is provided as a function with five arguments:
quadgram_score_fn(n_iiii, (n_iiix, n_iixi, n_ixii, n_xiii), (n_iixx, n_ixix, n_ixxi, n_xixi, n_xxii, n_xiix), (n_ixxx, n_xixx, n_xxix, n_xxxi), n_all)
The arguments constitute the marginals of a contingency table, counting the occurrences of particular events in a corpus. The letter i in the suffix refers to the appearance of the word in question, while x indicates the appearance of any word. Thus, for example:
n_iiii counts (w1, w2, w3, w4), i.e. the quadgram being scored
n_ixxi counts (w1, *, *, w4)
n_xxxx counts (*, *, *, *), i.e. any quadgram; this is the value passed as n_all
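To make the suffix notation concrete, the following is a minimal sketch that tallies the five arguments for one quadgram by brute force over a token list. The helper name quadgram_marginals and the pattern strings are hypothetical and not part of NLTK. The sketch follows the literal definition above (position-specific matches within contiguous quadgrams), which may differ from how NLTK's collocation finders tally their counts.

```python
def quadgram_marginals(tokens, quad):
    """Count the marginals for `quad` over all contiguous quadgrams in `tokens`.

    Hypothetical helper for illustration only; not an NLTK function.
    """
    quads = list(zip(tokens, tokens[1:], tokens[2:], tokens[3:]))

    def count(pattern):
        # `pattern` is four characters: 'i' means the word at that position
        # must equal the word in `quad`, 'x' means any word is accepted.
        return sum(
            all(p == "x" or w == q for p, w, q in zip(pattern, cand, quad))
            for cand in quads
        )

    n_iiii = count("iiii")
    trigrams = tuple(count(p) for p in ("iiix", "iixi", "ixii", "xiii"))
    bigrams = tuple(
        count(p) for p in ("iixx", "ixix", "ixxi", "xixi", "xxii", "xiix")
    )
    unigrams = tuple(count(p) for p in ("ixxx", "xixx", "xxix", "xxxi"))
    n_all = count("xxxx")  # every quadgram matches, so this is the total
    return n_iiii, trigrams, bigrams, unigrams, n_all


tokens = "a b c d a b c e a b c d".split()
marginals = quadgram_marginals(tokens, ("a", "b", "c", "d"))
print(marginals)
```

The returned tuple has the same shape as the argument list shown above, so it could be unpacked into a score function with `*marginals`.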
- classmethod chi_sq(*marginals)
Scores ngrams using Pearson’s chi-square as in Manning and Schutze 5.3.3.
- classmethod likelihood_ratio(*marginals)
Scores ngrams using likelihood ratios as in Manning and Schutze 5.3.4.
- static mi_like(*marginals, **kwargs)
Scores ngrams using a variant of mutual information. The keyword argument power sets an exponent (default 3) for the numerator. No logarithm of the result is calculated.
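As a rough illustration of the shape of this score, the sketch below raises the quadgram count to power and divides by the product of the four unigram marginals, without taking a logarithm. The function name mi_like_sketch is hypothetical, and the choice of denominator is an assumption: the description above only specifies the numerator exponent and the absence of a logarithm.

```python
from math import prod


def mi_like_sketch(n_iiii, unigram_counts, power=3):
    """Illustrative MI-like score; hypothetical, not the NLTK implementation.

    Assumes the denominator is the product of the unigram marginals
    (n_ixxx, n_xixx, n_xxix, n_xxxi).
    """
    return n_iiii ** power / prod(unigram_counts)


# Quadgram seen twice; its words seen 3, 3, 3 and 2 times in their positions.
score_default = mi_like_sketch(2, (3, 3, 3, 2))
score_squared = mi_like_sketch(2, (3, 3, 3, 2), power=2)
print(score_default, score_squared)
```

A larger power rewards frequent quadgrams more strongly relative to the frequency of their individual words.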