
Keep Multifast Compiled Across Serializations

Open 0xekez opened this issue 5 years ago • 0 comments

Currently, the serialization function only serializes the vector of patterns contained in a paraglob. To unserialize, a new paraglob is built from that vector of patterns and its Aho-Corasick structure is recompiled. This recompilation is expensive and can take as long as 10 seconds for very large pattern sets.
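For illustration, the round trip described above looks roughly like the sketch below. This is not paraglob's actual code: `PatternSet`, `serialize_patterns`, `unserialize_patterns`, and the length-prefixed byte layout are all hypothetical, and `compile()` is only a placeholder for the expensive Aho-Corasick build.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a paraglob: only the patterns are persisted.
struct PatternSet {
    std::vector<std::string> patterns;
    bool compiled = false;

    // Placeholder for the expensive Aho-Corasick build step.
    void compile() { compiled = true; }
};

// Append a 64-bit value in a fixed little-endian layout.
static void put_u64(std::vector<std::uint8_t>& out, std::uint64_t v) {
    for (int i = 0; i < 8; ++i)
        out.push_back(static_cast<std::uint8_t>(v >> (8 * i)));
}

// Read a 64-bit little-endian value and advance the cursor.
static std::uint64_t get_u64(const std::vector<std::uint8_t>& in, std::size_t& pos) {
    assert(pos + 8 <= in.size());
    std::uint64_t v = 0;
    for (int i = 0; i < 8; ++i)
        v |= static_cast<std::uint64_t>(in[pos + i]) << (8 * i);
    pos += 8;
    return v;
}

// Write the pattern count, then each pattern as (length, bytes).
std::vector<std::uint8_t> serialize_patterns(const PatternSet& ps) {
    std::vector<std::uint8_t> out;
    put_u64(out, ps.patterns.size());
    for (const auto& p : ps.patterns) {
        put_u64(out, p.size());
        out.insert(out.end(), p.begin(), p.end());
    }
    return out;
}

// Rebuild the pattern vector, then pay the full compile cost again.
PatternSet unserialize_patterns(const std::vector<std::uint8_t>& in) {
    PatternSet ps;
    std::size_t pos = 0;
    const std::uint64_t n = get_u64(in, pos);
    for (std::uint64_t i = 0; i < n; ++i) {
        const std::uint64_t len = get_u64(in, pos);
        assert(len <= in.size() - pos);
        ps.patterns.emplace_back(in.begin() + pos, in.begin() + pos + len);
        pos += static_cast<std::size_t>(len);
    }
    ps.compile();  // the recompilation this issue would like to avoid
    return ps;
}
```

The serialized form stays small and simple, but every unserialize ends in a full rebuild of the matcher.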

It would be very nice to be able to serialize a paraglob in such a way that it doesn't need to be recompiled after being unserialized. This is fairly difficult, though, because the compiled Aho-Corasick trie is complex and its memory isn't stored contiguously.
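One possible direction, sketched here under heavy assumptions, is an automaton whose nodes live in a single vector and link to each other by index rather than by pointer, so the compiled state can be copied out and back in as raw bytes. This is not how multifast lays out its trie; `Node`, `FlatAutomaton`, `compile_flat`, `to_bytes`, `from_bytes`, and `contains_any` are hypothetical names. The raw byte copy is also not portable across machines with different endianness or struct layout, and it does not validate untrusted input.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <queue>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical flat node: every link is an index into one vector, not a pointer.
struct Node {
    std::int32_t next[256];  // transitions; -1 while absent during construction
    std::int32_t fail;       // failure link
    std::int32_t match;      // 1 if a pattern ends here or along the failure chain
};
static_assert(std::is_trivially_copyable<Node>::value, "nodes must be byte-copyable");

struct FlatAutomaton {
    std::vector<Node> nodes;  // contiguous storage; index 0 is the root
};

static std::int32_t add_node(FlatAutomaton& a) {
    Node n;
    for (auto& x : n.next) x = -1;
    n.fail = 0;
    n.match = 0;
    a.nodes.push_back(n);
    return static_cast<std::int32_t>(a.nodes.size() - 1);
}

// Build the trie, then fill failure links and missing transitions breadth-first.
FlatAutomaton compile_flat(const std::vector<std::string>& patterns) {
    FlatAutomaton a;
    add_node(a);
    for (const auto& p : patterns) {
        std::int32_t cur = 0;
        for (unsigned char c : p) {
            if (a.nodes[cur].next[c] < 0) {
                const std::int32_t child = add_node(a);  // may reallocate the vector
                a.nodes[cur].next[c] = child;
            }
            cur = a.nodes[cur].next[c];
        }
        a.nodes[cur].match = 1;
    }
    std::queue<std::int32_t> q;
    for (int c = 0; c < 256; ++c) {
        const std::int32_t v = a.nodes[0].next[c];
        if (v < 0) a.nodes[0].next[c] = 0;
        else { a.nodes[v].fail = 0; q.push(v); }
    }
    while (!q.empty()) {
        const std::int32_t u = q.front();
        q.pop();
        a.nodes[u].match |= a.nodes[a.nodes[u].fail].match;
        for (int c = 0; c < 256; ++c) {
            const std::int32_t v = a.nodes[u].next[c];
            const std::int32_t via_fail = a.nodes[a.nodes[u].fail].next[c];
            if (v < 0) a.nodes[u].next[c] = via_fail;
            else { a.nodes[v].fail = via_fail; q.push(v); }
        }
    }
    return a;
}

// Copy the compiled nodes out as raw bytes.
std::vector<std::uint8_t> to_bytes(const FlatAutomaton& a) {
    std::vector<std::uint8_t> out(a.nodes.size() * sizeof(Node));
    if (!out.empty()) std::memcpy(out.data(), a.nodes.data(), out.size());
    return out;
}

// Copy the bytes back in; no recompilation is needed.
FlatAutomaton from_bytes(const std::vector<std::uint8_t>& bytes) {
    assert(bytes.size() % sizeof(Node) == 0);
    FlatAutomaton a;
    a.nodes.resize(bytes.size() / sizeof(Node));
    if (!bytes.empty()) std::memcpy(a.nodes.data(), bytes.data(), bytes.size());
    return a;
}

// True if any pattern occurs as a substring of text.
bool contains_any(const FlatAutomaton& a, const std::string& text) {
    std::int32_t s = 0;
    if (a.nodes[0].match) return true;
    for (unsigned char c : text) {
        s = a.nodes[s].next[c];
        if (a.nodes[s].match) return true;
    }
    return false;
}
```

The trade-off in this sketch is size: a dense 256-entry transition table per node makes the serialized form much larger than the pattern list, in exchange for skipping the rebuild on load.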

0xekez · May 23 '19 00:05