stringsearchalgorithms icon indicating copy to clipboard operation
stringsearchalgorithms copied to clipboard

More efficient Tries/DAWGs

Open almondtools opened this issue 6 years ago • 0 comments

The current trie actually is a directed acyclic word graph (DAWG), yet we have algorithms that do not depend on this:

  • Set-Horspool and Wu-Manber need simple tries (can be implemented more efficiently as double array)
  • Aho-Corasick needs a trie with support links (can also be implemented as double array, but needs additional structure for support links
  • (Set) Backward Oracle Matching uses a factor oracle base on a DAWG, so we will need this implementation at least here

Consequently each algorithm should use the best optimized variant instead of all depending on a DAWG (which is currently labeled trie).

almondtools avatar Dec 17 '18 21:12 almondtools