tsakorpus
Add a parameter for a regex to tokenize keyword (kw) fields
We've got customized tokenization for text fields like lemma, and it works fine (e.g. no tokenization by dot). Now a similar setting is needed for keyword fields like INEL SeR, SyF, etc. (e.g. to force tokenization by space).
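To make the request concrete, here is a minimal sketch of the behavior I have in mind. It is not existing tsakorpus code: the setting name `KW_FIELD_SPLIT_PATTERNS`, the function `split_kw_value` and the sample values are all made up for illustration, and the real parameter could look quite different.

```python
import re

# Hypothetical setting (not an existing tsakorpus parameter):
# keyword field name -> regex used as a separator between values.
KW_FIELD_SPLIT_PATTERNS = {
    "SeR": r"\s+",
    "SyF": r"\s+",
}


def split_kw_value(field, value, patterns=KW_FIELD_SPLIT_PATTERNS):
    """Split a keyword field value by the regex configured for that field.

    Fields without a configured regex are returned as a single value,
    i.e. they keep the usual keyword behavior.
    """
    pattern = patterns.get(field)
    if pattern is None:
        return [value]
    # Drop empty strings produced by leading or trailing separators.
    return [part for part in re.split(pattern, value) if part]


# Made-up sample values, only to show the splitting.
print(split_kw_value("SeR", "a.b:X  c.d:Y"))   # split by whitespace, dots kept
print(split_kw_value("other", "a.b:X c.d:Y"))  # no regex configured, left as is
```

With something like this, a space-separated value in SeR or SyF would be indexed as several keyword values, while the dots inside each value stay untouched.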