Sequence modeling

Language models

Vocabulary

Stores language model vocabulary.
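As a rough illustration of what a language-model vocabulary does, the sketch below keeps only tokens seen often enough and maps everything else to an unknown-word label. The helper names `build_vocab` and `lookup`, the `cutoff` parameter, and the `"<UNK>"` label are assumptions made for this example, not the class's actual interface.

```python
from collections import Counter

def build_vocab(tokens, cutoff=2):
    """Keep the tokens that occur at least `cutoff` times (hypothetical helper)."""
    counts = Counter(tokens)
    return {tok for tok, c in counts.items() if c >= cutoff}

def lookup(tokens, vocab, unk="<UNK>"):
    """Replace out-of-vocabulary tokens with the unknown label."""
    return [tok if tok in vocab else unk for tok in tokens]

tokens = "a b a c a b d".split()
vocab = build_vocab(tokens, cutoff=2)        # only "a" and "b" occur twice or more
mapped = lookup(["a", "d", "b", "z"], vocab)  # "d" and "z" become "<UNK>"
```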

NgramCounter

Class for counting ngrams.

MLE

Class for providing MLE ngram model scores.
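The relationship between n-gram counting and MLE scoring can be sketched in a few lines of plain Python. This is a minimal illustration of the idea, not the library's implementation; `count_ngrams` and `mle_score` are hypothetical names, and padding symbols are left out for brevity.

```python
from collections import Counter, defaultdict

def count_ngrams(sentences, n):
    """Map each (n-1)-word context to a Counter of the words that follow it."""
    counts = defaultdict(Counter)
    for sent in sentences:
        for i in range(len(sent) - n + 1):
            *context, word = sent[i:i + n]
            counts[tuple(context)][word] += 1
    return counts

def mle_score(counts, word, context):
    """Relative frequency of `word` after `context`; 0.0 for unseen contexts."""
    followers = counts.get(tuple(context))
    if not followers:
        return 0.0
    return followers[word] / sum(followers.values())

sents = [["a", "b", "c"], ["a", "b", "d"], ["a", "c"]]
bigrams = count_ngrams(sents, 2)
p_b_given_a = mle_score(bigrams, "b", ["a"])  # "a" is followed by "b" twice, "c" once
```

Because the score is a raw relative frequency, any word never seen after a context gets probability zero, which is what the smoothed models listed next are meant to address.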

Lidstone

Provides Lidstone-smoothed scores.

Laplace

Implements Laplace (add one) smoothing.
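Lidstone smoothing adds a constant gamma to every count before normalizing, and Laplace smoothing is the special case gamma = 1. The sketch below shows the formula on a single context; `lidstone_score` and its parameters are illustrative names, not the library's API.

```python
from collections import Counter

def lidstone_score(word, context_counts, vocab_size, gamma=1.0):
    """(count + gamma) / (total + gamma * V); gamma=1.0 gives Laplace smoothing."""
    total = sum(context_counts.values())
    return (context_counts[word] + gamma) / (total + gamma * vocab_size)

# Counts of words observed after one context, over a 4-word vocabulary.
followers = Counter({"b": 2, "c": 1})
vocabulary = ["b", "c", "d", "e"]

laplace_unseen = lidstone_score("d", followers, vocab_size=4)          # add-one
lidstone_seen = lidstone_score("b", followers, vocab_size=4, gamma=0.5)
total_mass = sum(lidstone_score(w, followers, 4) for w in vocabulary)
```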

WittenBellInterpolated

Interpolated version of Witten-Bell smoothing.

KneserNeyInterpolated

Interpolated version of Kneser-Ney smoothing.

AbsoluteDiscountingInterpolated

Interpolated version of smoothing with absolute discount.
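A bigram-only sketch of interpolated absolute discounting may help make the idea concrete: a fixed discount is subtracted from each observed count, and the freed probability mass is redistributed according to the unigram distribution. The function name, the data layout, and the discount value of 0.75 are assumptions chosen for this example.

```python
from collections import Counter

def abs_discount_score(word, context, bigrams, unigrams, discount=0.75):
    """Discounted bigram estimate interpolated with the unigram estimate."""
    p_lower = unigrams[word] / sum(unigrams.values())
    followers = bigrams.get(context)
    if not followers:
        return p_lower  # unseen context: fall back to the unigram estimate
    total = sum(followers.values())
    alpha = max(followers[word] - discount, 0.0) / total
    # Mass freed by discounting: one `discount` per distinct follower type.
    gamma = discount * len(followers) / total
    return alpha + gamma * p_lower

unigrams = Counter({"a": 3, "b": 2, "c": 2, "d": 1})
bigrams = {"a": Counter({"b": 2, "c": 1})}
probs = {w: abs_discount_score(w, "a", bigrams, unigrams) for w in unigrams}
```

In this toy setup the scores over the vocabulary still sum to one, since exactly the discounted mass is handed to the lower-order distribution.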

StupidBackoff

Provides StupidBackoff scores.
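Stupid Backoff returns a relative frequency when the full n-gram has been seen and otherwise backs off to a shorter context, multiplying by a fixed factor each time. The result is a score rather than a normalized probability. The sketch below is an illustration under assumptions: the helper names are hypothetical and the backoff factor of 0.4 is the value commonly quoted for this method.

```python
from collections import Counter

def ngram_counts(tokens, max_n):
    """Count all n-grams of order 1..max_n as tuples."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts

def stupid_backoff(word, context, counts, total_tokens, alpha=0.4):
    """Relative frequency if the n-gram was seen, else alpha * shorter-context score."""
    context = tuple(context)
    if not context:
        return counts[(word,)] / total_tokens
    seen = counts[context + (word,)]
    if seen > 0:
        return seen / counts[context]
    return alpha * stupid_backoff(word, context[1:], counts, total_tokens, alpha)

tokens = "a b c a b d a c".split()
counts = ngram_counts(tokens, 2)
seen_score = stupid_backoff("b", ["a"], counts, len(tokens))    # bigram "a b" was seen
backoff_score = stupid_backoff("d", ["a"], counts, len(tokens))  # falls back to unigram "d"
```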

Translation

AlignedSent

An aligned sentence object that encapsulates two sentences along with an Alignment between them.

Alignment

A storage class for representing the alignment between two sequences, s1 and s2.

PhraseTable

In-memory store of translations for a given phrase, and the log probability of those translations

IBMModel

Abstract base class for all IBM models

IBMModel1

Lexical translation model that ignores word order
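A lexical translation model that ignores word order can be trained with a short expectation-maximization loop: every source word in a sentence pair is treated as an equally plausible origin for each target word, and the translation table is re-estimated from the resulting fractional counts. The sketch below is a simplified illustration, not the library's implementation; `train_ibm1` is a hypothetical name and the NULL source token is omitted.

```python
from collections import defaultdict

def train_ibm1(pairs, iterations=10):
    """pairs: list of (target_words, source_words); returns t[(target, source)]."""
    trg_vocab = {tw for trg, _ in pairs for tw in trg}
    t = defaultdict(lambda: 1.0 / len(trg_vocab))  # uniform initialization
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: distribute each target word over the source words of its pair.
        for trg, src in pairs:
            for tw in trg:
                norm = sum(t[(tw, sw)] for sw in src)
                for sw in src:
                    frac = t[(tw, sw)] / norm
                    count[(tw, sw)] += frac
                    total[sw] += frac
        # M-step: renormalize the fractional counts per source word.
        for (tw, sw), c in count.items():
            t[(tw, sw)] = c / total[sw]
    return t

pairs = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm1(pairs)
```

On this toy corpus the table moves toward pairing "das" with "the" and "Buch" with "book", even though no word positions are used.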

IBMModel2

Lexical translation model that considers word order

IBMModel3

Translation model that considers how a word can be aligned to multiple words in another language

IBMModel4

Translation model that reorders output words based on their type and their distance from other related words in the output sentence

IBMModel5

Translation model that keeps track of vacant positions in the target sentence to decide where to place translated words

StackDecoder

Phrase-based stack decoder for machine translation

trace(backlinks, source_sents_lens, ...)

Traverses the alignment cost from the tracebacks and retrieves the appropriate sentence pairs.

grow_diag_final_and(srclen, trglen, e2f, f2e)

Symmetrizes the source-to-target and target-to-source word alignment outputs using the grow-diag-final-and (GDFA) heuristic.
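The grow-diag-final-and idea can be sketched as follows: start from the intersection of the two directional alignments, grow it with neighboring points from their union, then add remaining union points whose source and target words are both still unaligned. This is a simplified illustration under assumptions: it takes sets of (source index, target index) pairs rather than the arguments in the signature above, and the name `grow_diag_final_and_sketch` is hypothetical.

```python
def grow_diag_final_and_sketch(e2f, f2e):
    """e2f, f2e: sets of (src_idx, trg_idx) points from the two directions."""
    neighbors = [(-1, 0), (0, -1), (1, 0), (0, 1),
                 (-1, -1), (-1, 1), (1, -1), (1, 1)]
    union = e2f | f2e
    alignment = set(e2f & f2e)
    src_aligned = {s for s, _ in alignment}
    trg_aligned = {t for _, t in alignment}

    def add(point):
        alignment.add(point)
        src_aligned.add(point[0])
        trg_aligned.add(point[1])

    # grow-diag: add union points adjacent to existing points, as long as
    # they cover a source or target word that is not yet aligned.
    grew = True
    while grew:
        grew = False
        for s, t in sorted(alignment):
            for ds, dt in neighbors:
                p = (s + ds, t + dt)
                if (p in union and p not in alignment
                        and (p[0] not in src_aligned or p[1] not in trg_aligned)):
                    add(p)
                    grew = True

    # final-and: add union points whose source AND target are both unaligned.
    for p in sorted(union):
        if p[0] not in src_aligned and p[1] not in trg_aligned:
            add(p)
    return sorted(alignment)

e2f = {(0, 0), (1, 1), (2, 2)}
f2e = {(0, 0), (1, 1), (1, 2)}
symmetric = grow_diag_final_and_sketch(e2f, f2e)
```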

extract(f_start, f_end, e_start, e_end, ...)

Checks alignment points for consistency and extracts phrase pairs from the consistent chunk.
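The consistency check behind phrase extraction can be illustrated with a brute-force sketch: a source span and a target span form a phrase pair only if no alignment point links a word inside one span to a word outside the other, and at least one point lies inside both. The function names and the `max_len` limit are assumptions for this example, and the handling of unaligned boundary words is left out.

```python
def is_consistent(alignment, s_start, s_end, t_start, t_end):
    """True if every alignment point is inside both spans or outside both."""
    has_point = False
    for s, t in alignment:
        s_in = s_start <= s <= s_end
        t_in = t_start <= t <= t_end
        if s_in != t_in:
            return False  # a link crosses the phrase boundary
        if s_in:
            has_point = True
    return has_point

def extract_phrases(src, trg, alignment, max_len=3):
    """Enumerate all span pairs up to max_len words and keep the consistent ones."""
    phrases = set()
    for s_start in range(len(src)):
        for s_end in range(s_start, min(s_start + max_len, len(src))):
            for t_start in range(len(trg)):
                for t_end in range(t_start, min(t_start + max_len, len(trg))):
                    if is_consistent(alignment, s_start, s_end, t_start, t_end):
                        phrases.add((" ".join(src[s_start:s_end + 1]),
                                     " ".join(trg[t_start:t_end + 1])))
    return phrases

src = ["das", "Haus"]
trg = ["the", "house"]
phrases = extract_phrases(src, trg, {(0, 0), (1, 1)})
```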