
Doesn't return index of keywords found in text

Open: issamemari opened this issue 3 years ago • 1 comment

I've noticed that the implementation doesn't return the index at which each keyword was found in the text. This forces the user to run a second search for every keyword to find its index, even though the Aho-Corasick algorithm can provide this information at no extra cost.
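To illustrate why the index comes for free, here is a minimal sketch of an Aho-Corasick matcher in Go that reports start offsets. It is not the code of this library or of the fork; the names `Match`, `Matcher`, `NewMatcher`, and `FindAll` are hypothetical and chosen only for this example. When the automaton reaches a state that ends a keyword at byte `i`, the start offset is simply `i + 1 - len(keyword)`, so no second search is needed.

```go
package main

// Match is a hypothetical result type: which keyword matched and where.
type Match struct {
	Keyword int // index into the keyword list passed to NewMatcher
	Start   int // byte offset in the text where the occurrence begins
}

type node struct {
	next map[byte]int // goto transitions to child states
	fail int          // failure link (longest proper suffix that is a trie prefix)
	out  []int        // indices of all keywords that end at this state
}

// Matcher is a minimal byte-based Aho-Corasick automaton (illustrative only).
type Matcher struct {
	nodes []node
	lens  []int // keyword lengths, used to turn end offsets into start offsets
}

func NewMatcher(keywords []string) *Matcher {
	m := &Matcher{nodes: []node{{next: map[byte]int{}}}}
	// Phase 1: build the trie.
	for i, kw := range keywords {
		m.lens = append(m.lens, len(kw))
		if kw == "" {
			continue // empty keywords are ignored in this sketch
		}
		cur := 0
		for j := 0; j < len(kw); j++ {
			nxt, ok := m.nodes[cur].next[kw[j]]
			if !ok {
				nxt = len(m.nodes)
				m.nodes = append(m.nodes, node{next: map[byte]int{}})
				m.nodes[cur].next[kw[j]] = nxt
			}
			cur = nxt
		}
		m.nodes[cur].out = append(m.nodes[cur].out, i)
	}
	// Phase 2: breadth-first pass to set failure links and merge outputs.
	var queue []int
	for _, child := range m.nodes[0].next {
		queue = append(queue, child) // depth-1 states fail to the root (0)
	}
	for len(queue) > 0 {
		cur := queue[0]
		queue = queue[1:]
		for c, child := range m.nodes[cur].next {
			f := m.nodes[cur].fail
			for f != 0 {
				if _, ok := m.nodes[f].next[c]; ok {
					break
				}
				f = m.nodes[f].fail
			}
			if t, ok := m.nodes[f].next[c]; ok {
				f = t
			}
			m.nodes[child].fail = f
			m.nodes[child].out = append(m.nodes[child].out, m.nodes[f].out...)
			queue = append(queue, child)
		}
	}
	return m
}

// FindAll scans the text once and returns every occurrence with its start offset.
func (m *Matcher) FindAll(text string) []Match {
	var matches []Match
	state := 0
	for i := 0; i < len(text); i++ {
		c := text[i]
		for state != 0 {
			if _, ok := m.nodes[state].next[c]; ok {
				break
			}
			state = m.nodes[state].fail
		}
		if t, ok := m.nodes[state].next[c]; ok {
			state = t
		}
		for _, k := range m.nodes[state].out {
			// The match ends at byte i, so it starts len(keyword)-1 bytes earlier.
			matches = append(matches, Match{Keyword: k, Start: i + 1 - m.lens[k]})
		}
	}
	return matches
}
```

For example, `NewMatcher([]string{"he", "she", "his", "hers"}).FindAll("ushers")` reports "she" at offset 1, "he" at offset 2, and "hers" at offset 2. The map-based transitions keep the sketch short; a performance-oriented implementation would likely use a denser table.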

I've made several modifications to the implementation in my fork https://github.com/issamemari/ahocorasick, one of which makes the algorithm return the index of each match. I'm happy to submit a PR that includes only the changes related to this.

issamemari, Nov 10 '21 23:11

Agreed that's a very useful feature to have.

I've just tried out your fork @issamemari, and unfortunately on my test data it was 10x slower than this one (330ms vs 30ms). The baseline solution of looping over strings.Index() was at 180ms.
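For context, a baseline of that kind might look like the sketch below. This is an assumption about its general shape, not the code that was actually benchmarked: the names `Occurrence` and `FindAllNaive` are hypothetical, and the real baseline may have stopped at the first hit per keyword instead of collecting every occurrence.

```go
package main

import "strings"

// Occurrence is a hypothetical result type for this sketch.
type Occurrence struct {
	Keyword string
	Start   int // byte offset in the text where the keyword begins
}

// FindAllNaive searches for each keyword separately with strings.Index,
// so the text is rescanned once per keyword.
func FindAllNaive(text string, keywords []string) []Occurrence {
	var found []Occurrence
	for _, kw := range keywords {
		if kw == "" {
			continue // an empty keyword would match at every offset
		}
		offset := 0
		for {
			i := strings.Index(text[offset:], kw)
			if i < 0 {
				break
			}
			found = append(found, Occurrence{Keyword: kw, Start: offset + i})
			offset += i + 1 // advance by one byte so overlapping hits are kept
		}
	}
	return found
}
```

The cost of this approach grows with the number of keywords, which is why a single-pass automaton is expected to win once the keyword set is large enough.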

dbolkensteyn, Mar 06 '22 18:03