tokenizer topic
graphql-query-compress
Compress GraphQL Query String
xontrib-output-search
Get identifiers, paths, URLs and words from the previous command output and use them for the next command in xonsh shell.
ilmulti
Tooling to play around with multilingual machine translation for Indian languages.
hebrew_tokenizer
A field-tested Hebrew tokenizer for dirty texts (ben-yehuda project, bible, cc100, mc4, opensubs, oscar, twitter) focused on multi-word expression extraction.
nlp-js-tools-french
POS tagger, lemmatizer, and stemmer for the French language in JavaScript
ArabicProcessingCog
A Python package that does stemming, tokenization, sentence breaking, segmentation, normalization, and POS tagging for the Arabic language.
pascal-interpreter
A simple interpreter for a large subset of the Pascal language, written for educational purposes
lex
Lex is an implementation of the lex tool in Ruby.
mystem-scala
Morphological analyzer `mystem` (Russian language) wrapper for JVM languages
Hebrew-Tokenizer
A very simple Python tokenizer for Hebrew text.