repodiac

Results: 2 repositories owned by repodiac

german_transliterate

Stars: 35, Forks: 19, Watchers: 35

Python module to clean and transliterate (i.e. normalize) German text, including abbreviations, numbers, timestamps, etc. It can be used to clean messy text (e.g. map peculiar Unicode encodings to ASCII...

german_compound_splitter

Stars: 34, Forks: 6, Watchers: 34

Compound splitter for the German language ("Komposita-Zerlegung"), based on a large dictionary combined with highly efficient multi-pattern string search