colibri-core
Colibri Core is an NLP tool as well as a C++ and Python library for working with basic linguistic constructions such as n-grams and skipgrams (i.e. patterns with one or more gaps, either of fixed or dy...
Complicated by the fact that everything is now a patternpointer. Looking for the best approach to tackle this.
(Used by new filterset mechanism)
Hi, to begin with, thank you for the amazing work you've done so far. I have a few questions regarding my usage of colibri-core in my project. What I am...
Discovered by @fkunneman; output file was only 2 bytes (the initial null byte and version marker). Input text was just: ```prachtig apparaat en droogt goed kreukelvrij fijn de verlichting binnenin voelt...
(requested by Gabor Toth)
The following library could be pluggable into our current framework: *STXXL implements containers and algorithms that can process huge volumes of data that only fit on disks.*: http://stxxl.sourceforge.net/
Would it be possible to load corpora with mmap? This would make it possible to work with corpora larger than the available RAM, and is much more efficient if only...
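To illustrate the general idea behind this request, here is a minimal sketch of memory-mapped reading using Python's standard `mmap` module. It works on a plain-text file, not on Colibri Core's binary corpus format, and it is not how Colibri Core loads corpora; the helper names `count_token_mmap` and `demo` are hypothetical and exist only for this example.

```python
import mmap
import os
import tempfile


def count_token_mmap(path, token):
    """Count whitespace-delimited occurrences of `token` in a text file.

    The file is memory-mapped read-only, so the operating system pages
    data in on demand instead of the program reading it all into RAM.
    """
    needle = token.encode("utf-8")
    count = 0
    with open(path, "rb") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # readline() returns b"" once the end of the map is reached.
            for line in iter(mm.readline, b""):
                count += line.split().count(needle)
    return count


def demo():
    """Write a tiny two-line corpus to a temporary file and query it."""
    with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
        tmp.write(b"to be or not to be\nthat is the question\n")
        path = tmp.name
    try:
        return count_token_mmap(path, "be")
    finally:
        os.remove(path)
```

Because only the pages that are actually touched get loaded, a lookup that visits a small part of a large file does not pay for the rest of it, which is the efficiency argument made in the request.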
Currently, Colibri Core only extracts skipgrams in which the skip is not at an initial or final position, but in the middle. For example, patterns like `x {*}` and `{*}...
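The difference between middle-only skips and skips at the edges can be sketched in plain Python. This is an illustration only and does not use Colibri Core's API; the function name `skipgrams` and the `allow_edge_skips` flag are hypothetical, and each `{*}` here is assumed to stand for exactly one skipped token.

```python
from itertools import combinations

GAP = "{*}"


def skipgrams(tokens, n, allow_edge_skips=False):
    """Return the set of skipgram patterns of length n found in tokens.

    With allow_edge_skips=False, gaps may only replace tokens strictly
    inside the window (never the first or last position). With
    allow_edge_skips=True, any position may become a gap, as long as at
    least one concrete token remains in the pattern.
    """
    patterns = set()
    for start in range(len(tokens) - n + 1):
        window = tokens[start:start + n]
        if allow_edge_skips:
            positions = range(n)
        else:
            positions = range(1, n - 1)
        for k in range(1, len(positions) + 1):
            if k == n:
                continue  # skip the pattern that consists only of gaps
            for gaps in combinations(positions, k):
                patterns.add(
                    tuple(GAP if i in gaps else tok
                          for i, tok in enumerate(window))
                )
    return patterns


tokens = "to be or not".split()
middle_only = skipgrams(tokens, 3)
with_edges = skipgrams(tokens, 3, allow_edge_skips=True)
```

For this input, `middle_only` contains only `to {*} or` and `be {*} not`, whereas `with_edges` additionally contains patterns such as `to be {*}` and `{*} be or`, which are the kind with an initial or final skip.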