python-Levenshtein

Q: restrict operations

Open kingjr opened this issue 7 years ago • 1 comments

Hi,

Is it possible to restrict the matching to a subset of the edit operations? For example:

# default
changes = editops(A, B, operations=('insert', 'delete', 'replace'))
# as opposed to
changes = editops(A, B, operations=('insert', 'delete'))
# or
changes = editops(A, B, operations=('replace', 'delete'))

thanks!

kingjr, Sep 18 '17 22:09

This is not possible in python-Levenshtein, but two of the three cases can be achieved with RapidFuzz (https://github.com/maxbachmann/RapidFuzz):

For the default set of operations:

changes = editops(A, B, operations=('insert', 'delete', 'replace'))

use:

changes = rapidfuzz.distance.Levenshtein.editops(A, B)

For insertions and deletions only:

changes = editops(A, B, operations=('insert', 'delete'))

use:

changes = rapidfuzz.distance.Indel.editops(A, B)

For the third case:

changes = editops(A, B, operations=('replace', 'delete'))

This is not possible, and I am unsure how it would be implemented. I plan to add something along the lines of: changes = rapidfuzz.distance.Levenshtein.editops(A, B, weights=(1,1,1)). This would allow you to make insertions very expensive, so they would be avoided where possible. However, when insertions are not allowed at all, not every sequence can be converted into every other. E.g.:

editops("test", "teste", operations=('replace', 'delete'))

would not be possible, since the second string is longer and the conversion always requires an insertion into the first string.
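To make the point about unreachable conversions concrete, here is a minimal pure-Python sketch. It is not part of python-Levenshtein or RapidFuzz: the function name restricted_distance and its operations parameter are hypothetical, chosen to mirror the call proposed in the question. It uses the standard dynamic-programming recurrence, gives every forbidden operation an infinite cost, and returns only the distance (not the list of edit operations), or None when no conversion exists.

```python
import math


def restricted_distance(a, b, operations=("insert", "delete", "replace")):
    """Minimum number of allowed edit operations turning a into b.

    Hypothetical helper for illustration only. Returns None when b cannot
    be reached from a with the allowed operations.
    """
    inf = math.inf
    # A forbidden operation gets infinite cost, so it is never chosen.
    ins = 1 if "insert" in operations else inf
    dele = 1 if "delete" in operations else inf
    rep = 1 if "replace" in operations else inf

    # prev[j] is the cost of turning the empty prefix of a into b[:j].
    # The j == 0 case is handled separately to avoid 0 * inf (which is nan).
    prev = [j * ins if j else 0 for j in range(len(b) + 1)]

    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string needs i deletions.
        cur = [i * dele]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + dele,                          # delete ca
                cur[j - 1] + ins,                        # insert cb
                prev[j - 1] + (0 if ca == cb else rep),  # keep or replace
            ))
        prev = cur

    result = prev[-1]
    return None if math.isinf(result) else result


print(restricted_distance("test", "teste"))                         # all operations
print(restricted_distance("test", "teste", ("replace", "delete")))  # no insertions
print(restricted_distance("teste", "test", ("replace", "delete")))  # shrinking is fine
```

With insertions forbidden, any target longer than the source is unreachable, which is why the second call returns None while the reverse direction succeeds with a single deletion.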

maxbachmann, Feb 10 '22 13:02