
Fix leftMostLongestMatch returning all overlapping matches

Open semvis123 opened this issue 11 months ago • 2 comments

https://github.com/petar-dambovaliev/aho-corasick/pull/13 caused overlapping patterns to return multiple matches. This is fixed by using the new position calculation only for matches that are being cancelled due to MatchOnlyWholeWords.

Edit: Just realized that this only accounts for cases where the begin position is different, so this is not a complete fix. Example case where it will still fail (both patterns begin at the same position):

func TestOverlappingPatterns4(t *testing.T) {
	trieBuilder := NewAhoCorasickBuilder(Opts{
		MatchOnlyWholeWords: true,
		MatchKind:           LeftMostLongestMatch,
		DFA:                 false,
	})

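	// Both patterns start at offset 0. "testing 123" is followed by "4",
	// so it is not a whole word and should be cancelled.
	patterns := []string{"testing", "testing 123"}

	trie := trieBuilder.Build(patterns)
	result := trie.FindAll("testing 12345")
	if len(result) != 1 {
		t.Fatalf("expected exactly 1 match, got %d: %v", len(result), result)
	}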
}

semvis123 · Jul 27 '23 20:07