pyahocorasick

Python module (C extension and plain Python) implementing the Aho-Corasick algorithm

Results 46 pyahocorasick issues

Version: 1.4
Python 2.7.15

    class TEST():
        def __init__(self, input_filename):
            self.ac = ahocorasick.Automaton()
            n_word = 0
            with open(input_filename) as f:
                for text in f:
                    n_word += 1
                    word = text.strip()
                    self.ac.add_word(word,...

bug

demo code:

    import os
    import psutil
    import ahocorasick

    def build_automaton():
        automation = ahocorasick.Automaton()
        for i in range(2000000):
            automation.exists(str(i))

    def show_used_memory():
        print('memory used: {} M'.format(psutil.Process(os.getpid()).memory_info().rss / (1024. ** 2)))

    if...

bug

```
gcc -pthread -Wno-unused-result -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -fPIC -DAHOCORASICK_UNICODE= -I/home/pombreda/pyahocorasick/venv/include -I/home/pombreda/.pyenv/versions/3.6.10/include/python3.6m -c pyahocorasick.c -o build/temp.linux-x86_64-3.6/pyahocorasick.o
In file included from Automaton.c:1201:0,
                 from pyahocorasick.c:29:
Automaton_pickle.c: In function ‘automaton_unpickle’:
Automaton_pickle.c:363:17:...
```

Travis supported this. It is not clear whether GitHub Actions supports it.

enhancement

Hi, thanks for the great work! I am wondering whether case-insensitive string matching is supported. For example, when "information system" is in the built trie, and it can match...

enhancement
help-wanted
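One common way to get case-insensitive matching from any Aho-Corasick matcher is to case-fold both the keys and the searched text before they reach the automaton. The sketch below illustrates that idea with a small pure-Python automaton so it runs without the C extension; `build_automaton` and `find_all` are hypothetical helper names, not part of pyahocorasick. With pyahocorasick itself, the same normalization would presumably be applied to the strings passed to `add_word` and `iter`.

```python
from collections import deque


def build_automaton(keys):
    """Build a small Aho-Corasick automaton over case-folded keys."""
    goto = [{}]   # goto[state][char] -> next state
    fail = [0]    # failure link of each state
    out = [[]]    # (original key, folded length) pairs ending at each state
    for key in keys:
        folded = key.casefold()
        state = 0
        for ch in folded:
            if ch not in goto[state]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[state][ch] = len(goto) - 1
            state = goto[state][ch]
        out[state].append((key, len(folded)))
    # Breadth-first pass to fill in failure links and inherited outputs.
    queue = deque(goto[0].values())
    while queue:
        state = queue.popleft()
        for ch, nxt in goto[state].items():
            queue.append(nxt)
            f = fail[state]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[nxt] = goto[f].get(ch, 0)
            out[nxt] = out[nxt] + out[fail[nxt]]
    return goto, fail, out


def find_all(automaton, text):
    """Return (start, original key) for every match, ignoring case.

    Start offsets refer to the case-folded text, which can differ in
    length from the input for a few characters (for example "ß").
    """
    goto, fail, out = automaton
    state = 0
    hits = []
    for i, ch in enumerate(text.casefold()):
        while state and ch not in goto[state]:
            state = fail[state]
        state = goto[state].get(ch, 0)
        for key, length in out[state]:
            hits.append((i - length + 1, key))
    return hits


automaton = build_automaton(["information system", "System"])
print(find_all(automaton, "An Information System and a SYSTEM."))
# [(3, 'information system'), (15, 'System'), (28, 'System')]
```

Storing the original spelling next to the folded key lets the caller report which dictionary entry matched, while the automaton itself only ever sees folded characters.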

Hello! Thank you for the great library. I need to search an input string for multiple keys stored in the automaton. The number of keys is large and they overlap...

question

Hi, I'm just starting to discover and use your library to extract terms defined in a thesaurus from an input text, and "highlight" them in an HTML output. It works...

enhancement
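The highlighting step described in this request could look roughly like the following once match positions are known. This is only a sketch under assumptions: `highlight` is a hypothetical helper, the spans are taken to be half-open `(start, end)` offsets into the original text, and overlapping matches are resolved by keeping whichever span starts first.

```python
import html


def highlight(text, spans, tag="mark"):
    """Wrap each (start, end) span of text in <tag>, escaping all text."""
    parts = []
    pos = 0
    for start, end in sorted(spans):
        if start < pos:
            # Skip a span that overlaps one already emitted.
            continue
        parts.append(html.escape(text[pos:start]))
        parts.append("<{0}>{1}</{0}>".format(tag, html.escape(text[start:end])))
        pos = end
    parts.append(html.escape(text[pos:]))
    return "".join(parts)


print(highlight("cats & dogs", [(0, 4), (7, 11)]))
# <mark>cats</mark> &amp; <mark>dogs</mark>
```

Escaping each slice separately keeps characters such as `&` or `<` in the source text from being interpreted as markup in the output.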

An attempt to solve #102. What was done: instead of creating pickle data before pickling, we create an iterator which was meant to yield small portions of data on demand....

Hi, is there a way to determine a prepared Automaton's memory footprint? It would be helpful when using it in a size-limited cache. Thank you.