aho-corasick
High memory usage compared to another implementation
Hi, I found this while looking for a lower memory usage alternative to anknown/ahocorasick.
I have a dataset of around 6 million strings. The total memory usage, as shown by pprof, after building the automaton is just over 30GB, compared to 6.5GB for the anknown version.
Do you have any tips for working out why it's using so much more RAM?
Thanks in advance.
Hey, sorry for the late response, lol. It's been 2 years, more or less.
I haven't had much time for open source.
I am not familiar with the implementation of anknown.
I'll need to check it out before making some kind of a statement.
I will also need to analyse your data.