nltk.metrics.edit_distance_align¶
- nltk.metrics.edit_distance_align(s1, s2, substitution_cost=1)[source]¶
Calculate the minimum Levenshtein edit-distance based alignment mapping between two strings. The alignment finds the mapping from string s1 to s2 that minimizes the edit distance cost. For example, mapping “rain” to “shine” would involve 2 substitutions, 2 matches and an insertion resulting in the following mapping: [(0, 0), (1, 1), (2, 2), (3, 3), (4, 4), (4, 5)] NB: (0, 0) is the start state without any letters associated See more: https://web.stanford.edu/class/cs124/lec/med.pdf
In case of multiple valid minimum-distance alignments, the backtrace has the following operation precedence:
Skip s1 character
Skip s2 character
Substitute s1 and s2 characters
The backtrace is carried out in reverse string order.
This function does not support transposition.
- Parameters
s2 (str) – The strings to be aligned
- Return type
List[Tuple(int, int)]