nltk.collocations.BigramCollocationFinder
- class nltk.collocations.BigramCollocationFinder
Bases: AbstractCollocationFinder
A tool for finding bigram collocations and ranking them by association measures. It is usually more convenient to call from_words() than to construct an instance directly.
- default_ws = 2
- __init__(word_fd, bigram_fd, window_size=2)
Construct a BigramCollocationFinder, given FreqDists for appearances of words and (possibly non-contiguous) bigrams.
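As a rough illustration of direct construction, the sketch below builds the two frequency distributions by hand and passes them to the constructor. The toy sentence and the variable names are assumptions made for this example; FreqDist and bigrams are imported from nltk.probability and nltk.util.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.probability import FreqDist
from nltk.util import bigrams

# Toy token list, assumed purely for illustration.
tokens = "the quick brown fox jumps over the lazy dog the quick red fox".split()

# One FreqDist for single words, one for contiguous word pairs.
word_fd = FreqDist(tokens)
bigram_fd = FreqDist(bigrams(tokens))

finder = BigramCollocationFinder(word_fd, bigram_fd)

print(finder.ngram_fd[("the", "quick")])
print(finder.word_fd["the"])
```

For contiguous bigrams this should be equivalent to calling from_words() on the same tokens, which is the shorter route in most cases.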
- classmethod from_words(words, window_size=2)
Construct a BigramCollocationFinder for all bigrams in the given sequence. When window_size > 2, non-contiguous bigrams are counted as well: each word is paired with each of the window_size - 1 words that follow it, in the style of Church and Hanks’s (1990) association ratio.
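To make the effect of window_size concrete, here is a minimal standard-library sketch of windowed bigram counting. It is not NLTK's source code; the helper name count_windowed_bigrams is hypothetical, and the input is assumed to be a list so that it can be sliced.

```python
from collections import Counter


def count_windowed_bigrams(words, window_size=2):
    """Pair each word with the (window_size - 1) words that follow it."""
    word_counts = Counter()
    bigram_counts = Counter()
    for i, w1 in enumerate(words):
        word_counts[w1] += 1
        # Words at most (window_size - 1) positions to the right of w1.
        for w2 in words[i + 1 : i + window_size]:
            bigram_counts[(w1, w2)] += 1
    return word_counts, bigram_counts


tokens = "a b c d".split()
_, contiguous = count_windowed_bigrams(tokens, window_size=2)
_, windowed = count_windowed_bigrams(tokens, window_size=3)

print(sorted(contiguous))  # only adjacent pairs
print(sorted(windowed))    # adjacent pairs plus pairs one word apart
```

With window_size=3, pairs such as ("a", "c") appear in addition to the adjacent pairs, which is the non-contiguous counting described above.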
- score_ngram(score_fn, w1, w2)
Returns the score for a given bigram using the given scoring function. Following Church and Hanks (1990), counts are scaled by a factor of 1/(window_size - 1).
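A short usage sketch, assuming BigramAssocMeasures.pmi from nltk.metrics as the scoring function and a toy sentence chosen for this example. With the default window_size of 2 the scaling factor is 1, so the raw bigram count is used unchanged. The hand computation assumes PMI is defined as log2(n_bigram * N / (n_w1 * n_w2)).

```python
import math

from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Toy token list, assumed purely for illustration.
tokens = "the quick brown fox jumps over the lazy dog the quick red fox".split()
finder = BigramCollocationFinder.from_words(tokens)

score = finder.score_ngram(BigramAssocMeasures.pmi, "the", "quick")

# Hand computation under the stated assumption about PMI:
# ("the", "quick") occurs 2 times, "the" 3 times, "quick" 2 times, N = 13.
expected = math.log2((2 * 13) / (3 * 2))

print(score, expected)
```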
- above_score(score_fn, min_score)
Returns a sequence of ngrams, ordered by decreasing score, whose scores each exceed the given minimum score.
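A small usage sketch, assuming BigramAssocMeasures.raw_freq (bigram count divided by the total word count) as the scoring function. The toy sentence and the threshold of 0.1 are assumptions for illustration. The result is wrapped in list() so that it can be printed and reused.

```python
from nltk.collocations import BigramCollocationFinder
from nltk.metrics import BigramAssocMeasures

# Toy token list, assumed purely for illustration.
tokens = "the quick brown fox jumps over the lazy dog the quick red fox".split()
finder = BigramCollocationFinder.from_words(tokens)

# ("the", "quick") occurs twice among 13 tokens (about 0.154);
# every other bigram occurs once (about 0.077).
frequent = list(finder.above_score(BigramAssocMeasures.raw_freq, 0.1))

print(frequent)
```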
- apply_freq_filter(min_freq)
Removes candidate ngrams whose frequency is less than min_freq.
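A brief usage sketch with an assumed toy sentence: the filter modifies the finder in place, so only bigrams seen at least min_freq times remain as candidates afterwards.

```python
from nltk.collocations import BigramCollocationFinder

# Toy token list, assumed purely for illustration.
tokens = "the quick brown fox jumps over the lazy dog the quick red fox".split()
finder = BigramCollocationFinder.from_words(tokens)

before = len(finder.ngram_fd)
finder.apply_freq_filter(2)  # drop bigrams seen fewer than 2 times
after = set(finder.ngram_fd)

print(before, after)
```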
- apply_ngram_filter(fn)
Removes candidate ngrams (w1, w2, …) where fn(w1, w2, …) evaluates to True.
- apply_word_filter(fn)
Removes candidate ngrams (w1, w2, …) where any of (fn(w1), fn(w2), …) evaluates to True.
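The two predicate-based filters can be sketched as follows. The stopword set, the toy sentence, and the rule "drop bigrams ending in fox" are assumptions chosen only to show which candidates each filter removes.

```python
from nltk.collocations import BigramCollocationFinder

# Toy token list, assumed purely for illustration.
tokens = "the quick brown fox jumps over the lazy dog the quick red fox".split()
finder = BigramCollocationFinder.from_words(tokens)

# Word filter: drop any bigram in which either word is a stopword.
stopwords = {"the", "over"}
finder.apply_word_filter(lambda w: w in stopwords)

# Ngram filter: the predicate receives both words of the bigram at once.
finder.apply_ngram_filter(lambda w1, w2: w2 == "fox")

remaining = sorted(finder.ngram_fd)
print(remaining)
```

Both filters act on the candidate bigrams only; the word frequencies used for scoring are left as they were.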