Wojciech Muła

94 comments by Wojciech Muła

@zhu Thanks, that's important. So a safe way would be to use UTF-32 (4 bytes per code point), which is not very memory-friendly.
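To make the memory trade-off concrete, here is a minimal Python sketch comparing how many bytes the same text takes in UTF-8, UTF-16, and UTF-32. The helper name `encoded_sizes` and the sample strings are illustrative assumptions and are not taken from the module under discussion.

```python
def encoded_sizes(text):
    """Return the byte length of `text` in three Unicode encodings.

    The little-endian variants are used so that no byte-order mark
    is prepended and the counts reflect the payload only.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
        "utf-32": len(text.encode("utf-32-le")),
    }


if __name__ == "__main__":
    # ASCII text, Polish text with diacritics, and one code point
    # outside the Basic Multilingual Plane.
    for sample in ("hello", "zażółć", "\U0001F600"):
        print(repr(sample), len(sample), encoded_sizes(sample))
```

UTF-32 always costs exactly 4 bytes per code point, so indexing is trivial, but plain ASCII text grows fourfold compared with UTF-8.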

@zhu Thanks a lot. It is more than strange. So there is no perfect solution.

Thank you @pombredanne for the explanation. I always thought that Unicode was just for assigning numbers to characters. :)

@asmithatlassian Thank you for your kind words.

There was an attempt to split the module into py2 and py3 lines. But that approach failed, as the code was duplicated and there were a lot of burdens with...

The Aho-Corasick automaton is a simple DFA, so while visiting nodes we would need to save somewhere the whole path that led to the current accepting state. At first glance it...

@Pehat I'm not sure if a simple backtrace would do the job, because an accepting node might be reached via different paths. This is why I suggested that the matched string...

I was thinking about this and to my knowledge it's simply not possible without extra memory overhead. When you are at a given node and going down the trie, you're simply...
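One way to picture that overhead is a hypothetical trie node that also stores a link to its parent and the character on the incoming edge, so the path can be rebuilt by walking upward. This is only an illustrative Python sketch under those assumptions. The names `Node`, `add_word`, and `path_to` are invented here and do not reflect the module's actual C structures.

```python
class Node:
    """Trie node with two extra fields (parent, char) used only for
    path reconstruction; these fields are the added memory cost."""

    __slots__ = ("children", "parent", "char", "accepting")

    def __init__(self, parent=None, char=None):
        self.children = {}      # outgoing edges: character -> Node
        self.parent = parent    # extra pointer per node
        self.char = char        # character on the edge from the parent
        self.accepting = False  # True if a stored word ends here


def add_word(root, word):
    """Insert `word` into the trie and return its final node."""
    node = root
    for ch in word:
        nxt = node.children.get(ch)
        if nxt is None:
            nxt = Node(node, ch)
            node.children[ch] = nxt
        node = nxt
    node.accepting = True
    return node


def path_to(node):
    """Rebuild the string spelled from the root to `node` by
    following parent links upward and reversing the result."""
    chars = []
    while node.parent is not None:
        chars.append(node.char)
        node = node.parent
    return "".join(reversed(chars))
```

Without the `parent` and `char` fields a node has no way to name the path that leads to it, which is why recovering the matched string costs either extra memory per node or a separately stored copy of the key.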

So, it seems to be doable at the C level. But I don't want to increase memory consumption, as the structure is already big. I'd rather shrink nodes. An option would be...

@gladtosee To be honest, I wasn't aware of this problem; you are the first one to mention it. I need to learn a little bit about this issue. Thanks for these...