RapidFuzz
Add SIMD support for long sequences
For sequences longer than 64 characters, it would still be possible to calculate the similarity for multiple sequences in parallel using SIMD. However, for very long sequences it might be faster to compare the sequences individually, especially when a score_cutoff is specified.
This has a couple of problems:
- depending on the metric it can be hard to implement, since the algorithm's behavior depends on the individual string lengths
- many of the algorithms use Ukkonen bands to improve the runtime when the user provides a score_cutoff. Since the Ukkonen bands depend on the string lengths, it's not really possible to combine them with SIMD across multiple sequences. Depending on the user-provided score_cutoff, the Ukkonen bands can provide a much larger speedup than SIMD would.
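
To illustrate the second point, here is a minimal Python sketch of a banded Levenshtein distance. It is not RapidFuzz's actual implementation: the function name `banded_levenshtein` is hypothetical, and the band is the simplest possible one (only cells with `|i - j| <= score_cutoff` are evaluated), not the tighter band a tuned implementation would use. It shows that both the early exit and the set of cells visited in each row depend on the lengths of the two strings and on `score_cutoff`, so sequences of different lengths packed into one SIMD register would each need a different band.

```python
def banded_levenshtein(s1, s2, score_cutoff):
    """Return the Levenshtein distance if it is <= score_cutoff,
    otherwise score_cutoff + 1. Hypothetical helper for illustration."""
    len1, len2 = len(s1), len(s2)

    # The length difference is a lower bound on the distance, so this
    # early exit depends on the individual string lengths.
    if abs(len1 - len2) > score_cutoff:
        return score_cutoff + 1

    big = score_cutoff + 1  # stands for "more than score_cutoff"

    # Row 0: distance from the empty prefix of s1 to s2[:j], capped at big.
    prev = [j if j <= score_cutoff else big for j in range(len2 + 1)]

    for i in range(1, len1 + 1):
        # Only columns within score_cutoff of the diagonal can hold a
        # value <= score_cutoff; everything else stays at big.
        lo = max(1, i - score_cutoff)
        hi = min(len2, i + score_cutoff)

        cur = [big] * (len2 + 1)
        if i <= score_cutoff:
            cur[0] = i

        for j in range(lo, hi + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            cur[j] = min(
                prev[j - 1] + cost,  # substitution or match
                prev[j] + 1,         # deletion from s1
                cur[j - 1] + 1,      # insertion into s1
                big,                 # cap values above the cutoff
            )
        prev = cur

    return prev[len2]
```

With a small `score_cutoff`, each row touches at most `2 * score_cutoff + 1` cells instead of `len(s2)` cells, which is where the speedup for long sequences comes from.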