Max Bachmann

Results 300 comments of Max Bachmann

I was wondering why my own implementation (https://github.com/maxbachmann/RapidFuzz) was faster for very dissimilar sequences at some point. And indeed, similar to this PR, I store the deltas separately, so I...

> Yup, for smaller sequences the housekeeping becomes too much. I think the best approach is to use a different algorithm (like Landau-Vishkin) for smaller sequences. Idea was to put this...

A common use case for things like search engines is to generate the tree once and then use it for searches. For this reason this should support some kind of...
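A rough sketch of the build-once, query-many pattern described above. This assumes the tree is something like a BK-tree over plain Levenshtein distance; the `BKTree` class, its methods, and the `levenshtein` helper are hypothetical names for illustration, not the API of any particular library.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def levenshtein(a: str, b: str) -> int:
    """Plain two-row dynamic-programming edit distance (illustrative only)."""
    if len(a) < len(b):
        a, b = b, a
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


# A node is (word, children), where children maps distance -> child node.
Node = Tuple[str, Dict[int, "Node"]]


class BKTree:
    """Hypothetical BK-tree: built once, then reused for many searches."""

    def __init__(self, words: Iterable[str]) -> None:
        self.root: Optional[Node] = None
        for word in words:
            self.add(word)

    def add(self, word: str) -> None:
        if self.root is None:
            self.root = (word, {})
            return
        node = self.root
        while True:
            d = levenshtein(word, node[0])
            if d == 0:
                return  # word is already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = (word, {})
                return
            node = child

    def search(self, query: str, max_dist: int) -> List[Tuple[int, str]]:
        """Return sorted (distance, word) pairs within max_dist of query."""
        results: List[Tuple[int, str]] = []
        stack = [self.root] if self.root is not None else []
        while stack:
            word, children = stack.pop()
            d = levenshtein(query, word)
            if d <= max_dist:
                results.append((d, word))
            # Triangle inequality: only subtrees whose edge label lies in
            # [d - max_dist, d + max_dist] can contain a match.
            for edge, child in children.items():
                if d - max_dist <= edge <= d + max_dist:
                    stack.append(child)
        return sorted(results)


# Build once, then query repeatedly.
tree = BKTree(["book", "books", "cake", "boo", "cape", "cart"])
matches = tree.search("bok", 1)
```

The construction cost is paid once in `BKTree(...)`; each later `search` call only visits the subtrees that the triangle inequality cannot rule out.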

In the same context, Levenshtein automata could be interesting as well: https://dmice.ohsu.edu/bedricks/courses/cs655/pdf/readings/2002_Schulz.pdf

This has a couple of problems: 1) depending on the metric it can be hard to implement, since the algorithm's behavior depends on the individual string lengths 2) many of...

Note that I am unsure how simple/hard romanisation is depending on the language, since I have zero experience with languages that need this sort of preprocessing. So any solution making...

I think a documentation section on options for romanisation for different languages would make sense. It is a fairly common thing people run into when matching non-Roman languages and so...

Glad to hear that the library proves to be useful :) Do I understand it correctly, that you have essentially two columns `A` and `B`, where in each row you...

Not in depth. However, I just had a quick look at their two benchmarks:

## Long sequences

This benchmark is mentioned in the readme and mentions them being significantly faster...

I am really still not sure what to do in these cases. One basic issue right now is that when processing very long sequences there is not really any way...