libfsm icon indicating copy to clipboard operation
libfsm copied to clipboard

Use Aho-Corasick for alts of literals during regexp AST compilation

Open katef opened this issue 4 years ago • 1 comments

I'll have more to say about this, but I'm too tired to write about it at the moment.

katef avatar Jul 19 '20 06:07 katef

It's difficult to see the structure for unanchored strings (because we have no way to inform graphviz to weight the backwards edges differently).

But here's the simpler case for anchored strings, showing 100 alts of {3,20} characters in length. The same (randomly generated) data for each. Hopefully you can see the difference in the structure, with the trie shape visible.

Recursive Thompson construction Aho-Corasick
image image

katef avatar Jul 19 '20 06:07 katef