RapidFuzz
Rapid fuzzy string matching in Python using various string metrics
In relation to issue #180: It would be great to generalize the planned implementation of the Damerau-Levenshtein distance to weights depending not only on the nature of the operation (insert/delete/substitute/transpose),...
Add a pre-release test that checks whether the submodules `extern/rapidfuzz-cpp` and `extern/jarowinkler-cpp` are using the newest tag available.
The pure Python implementation still misses the following parts:
- [ ] Levenshtein.editops
- [ ] Levenshtein.opcodes
- [ ] Indel.editops
- [ ] Indel.opcodes
- [ ] LCSseq.editops
-...
I'm currently trying to use cx_freeze to package an application that uses rapidfuzz. The packaging is successful, but when I try to run the application I get the following error. ```implementation...
error code below ``` × Building wheel for rapidfuzz (pyproject.toml) did not run successfully. │ exit code: 1 ╰─> [1852 lines of output] Not searching for unused variables given on...
Currently the process module has the following functions:

| function | kind | explanation |
|------------|--------|-----------------|
| extractOne | one x many | returns the best match as (choice, score,...
A banded version of the Levenshtein distance algorithm should be implemented as described in https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.142.1245&rep=rep1&type=pdf. This would reduce the runtime of `string_metric.levenshtein` from `O(N/64 * M)` to `O(score_cutoff/64 * M`...
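To illustrate the banding idea, here is a minimal sketch in plain Python. It is not the bit-parallel variant the issue refers to and it is not RapidFuzz code: the function name `banded_levenshtein` and its return convention (`None` when the distance exceeds `score_cutoff`) are assumptions made for this example. It only fills cells within `score_cutoff` of the main diagonal, so the work per row is bounded by the cutoff rather than by the length of the other string.

```python
from typing import Optional


def banded_levenshtein(s1: str, s2: str, score_cutoff: int) -> Optional[int]:
    """Hypothetical helper: uniform-cost Levenshtein distance restricted to a band.

    Returns the distance if it is <= score_cutoff, otherwise None.
    """
    if score_cutoff < 0:
        raise ValueError("score_cutoff must be non-negative")
    n, m = len(s1), len(s2)
    # The length difference is a lower bound on the distance.
    if abs(n - m) > score_cutoff:
        return None

    big = score_cutoff + 1  # any value above the cutoff acts as "infinity"
    prev = [min(j, big) for j in range(m + 1)]  # row 0 of the DP matrix
    cur = [big] * (m + 1)

    for i in range(1, n + 1):
        # Only columns with |i - j| <= score_cutoff can hold a useful value.
        lo = max(1, i - score_cutoff)
        hi = min(m, i + score_cutoff)
        # Cell to the left of the band: real value in column 0, else infinity.
        cur[lo - 1] = min(i, big) if lo == 1 else big
        for j in range(lo, hi + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            cur[j] = min(
                prev[j] + 1,         # deletion
                cur[j - 1] + 1,      # insertion
                prev[j - 1] + cost,  # match or substitution
                big,                 # keep values capped
            )
        # Cell to the right of the band is read by the next row.
        if hi < m:
            cur[hi + 1] = big
        prev, cur = cur, prev

    return prev[m] if prev[m] <= score_cutoff else None


print(banded_levenshtein("kitten", "sitting", 3))
print(banded_levenshtein("kitten", "sitting", 2))
```

Each row touches at most `2 * score_cutoff + 1` cells, which is where the dependence on `score_cutoff` instead of `N` comes from. A bit-parallel implementation would apply the same restriction to 64-bit blocks rather than to individual cells.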
Some of the processor functions can run for a long time. However, it is currently not possible to quit by pressing Ctrl+C. Instead it is required to manually kill...
All the algorithms in the process module should be fairly simple to run in parallel.
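As a rough illustration of why this is simple, the sketch below splits the choices into chunks, finds the best match per chunk in a thread pool, and merges the partial results. It uses only the standard library: `difflib.SequenceMatcher` stands in for a real scorer, and the names `parallel_extract_one`, `workers` and `chunk_size` are hypothetical, not part of the RapidFuzz API. With a pure Python scorer the GIL prevents a real speedup; the pattern would pay off with a native scorer that releases the GIL.

```python
from concurrent.futures import ThreadPoolExecutor
from difflib import SequenceMatcher


def _score(query, choice):
    # Stand-in scorer (assumption): similarity in the range 0..100.
    return 100.0 * SequenceMatcher(None, query, choice).ratio()


def _best_in_chunk(query, chunk, offset):
    # Best (choice, score, index) within one chunk; first match wins ties.
    best = None
    for i, choice in enumerate(chunk):
        score = _score(query, choice)
        if best is None or score > best[1]:
            best = (choice, score, offset + i)
    return best


def parallel_extract_one(query, choices, workers=4, chunk_size=1000):
    """Hypothetical helper: best match as (choice, score, index), or None."""
    chunks = [
        (choices[i:i + chunk_size], i)
        for i in range(0, len(choices), chunk_size)
    ]
    if not chunks:
        return None
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(
            pool.map(lambda c: _best_in_chunk(query, c[0], c[1]), chunks)
        )
    # Highest score wins; on equal scores prefer the lowest index.
    return max(partial, key=lambda r: (r[1], -r[2]))


result = parallel_extract_one(
    "apple", ["apply", "banana", "apple", "grape"], workers=2, chunk_size=2
)
print(result)
```

Because each chunk is scored independently and the merge step only compares a handful of tuples, the same structure carries over to the other one x many functions.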