Building mildly deep automatons takes a long time

Open pombredanne opened this issue 9 years ago • 2 comments

With the latest 2.0, this snippet, which creates an automaton from 1000 strings of 2000 characters each, never finishes: build() takes forever to complete, and I eventually killed it:

>>> from array import array
>>> from acora import AcoraBuilder
>>> tks = [array('h', range(x, x+1000)).tostring() for x in range(1000)]
>>> builder = AcoraBuilder(*tks)
>>> ac = builder.build()
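
To give a sense of scale, here is a rough stdlib-only sketch that counts how many trie nodes (distinct non-empty prefixes) this keyword set implies. It does not use acora and says nothing about how acora builds its automaton internally; `make_keywords`, `shared_prefix_len` and `count_trie_nodes` are hypothetical helper names, and `tobytes()` is used as the Python 3 spelling of `tostring()`. Because the keywords differ within their first two bytes, they share almost no prefixes, so the count comes out close to 1000 x 2000.

```python
from array import array


def make_keywords(count=1000, shorts_per_keyword=1000):
    """Rebuild the keyword set from the snippet above (Python 3 spelling)."""
    return [array('h', range(x, x + shorts_per_keyword)).tobytes()
            for x in range(count)]


def shared_prefix_len(a, b):
    """Length of the longest common prefix of two byte strings."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def count_trie_nodes(keywords):
    """Count non-root trie nodes without building the trie.

    After sorting, each keyword adds one node for every byte that is
    not shared with the keyword just before it.
    """
    total = 0
    previous = b""
    for keyword in sorted(keywords):
        total += len(keyword) - shared_prefix_len(previous, keyword)
        previous = keyword
    return total


keywords = make_keywords()
print(len(keywords), "keywords of", len(keywords[0]), "bytes each")
print("trie nodes:", count_trie_nodes(keywords))
```

Passing smaller values to `make_keywords` gives reduced inputs that may be handier for timing `build()` at increasing sizes.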

pombredanne, Jun 24 '16 11:06

Note: this is a follow-up to #6.

pombredanne, Jun 24 '16 12:06

FWIW, building an automaton with @WojciechMula's Python module at https://github.com/WojciechMula/pyahocorasick/blob/master/py/pyahocorasick.py (not even the C implementation) is much, much faster.

pombredanne, Jul 02 '16 00:07