duckling_old
Probabilistically detecting the language that the user entered
It is nice that duckling supports several languages, but I thought it would be even more awesome if the language could be determined from the input itself, i.e. from indicators that the input is in a certain language. That way the user wouldn't need to care about which language to enter the text in.
Additionally, it would be good to have a "preferred language" option. Example: if a date is not clearly English or German, but a former input of the user (session data) was English, then English would be preferred.
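To make the "preferred language" idea concrete, here is a minimal sketch in Python. It assumes some detector has already produced a score per language; the function name `choose_language`, the `margin` threshold and the score format are all hypothetical and not part of duckling.

```python
from typing import Mapping, Optional


def choose_language(scores: Mapping[str, float],
                    preferred: Optional[str] = None,
                    margin: float = 0.1) -> Optional[str]:
    """Pick a language, letting the session preference break near-ties.

    `scores` maps a language code to a detector score (higher is better).
    `preferred` is the language of an earlier input in the same session.
    `margin` is an assumed tolerance, not a tuned value.
    """
    if not scores:
        # Nothing detected at all: fall back to the session preference.
        return preferred
    best_lang, best_score = max(scores.items(), key=lambda kv: kv[1])
    # If the preferred language scores within `margin` of the best one,
    # the input is ambiguous enough that the preference should win.
    if preferred in scores and best_score - scores[preferred] <= margin:
        return preferred
    return best_lang
```

With equal scores for "en" and "de" and an earlier English input, this returns "en"; a clearly German input still wins over the preference.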
Hey @martinheidegger,
You're absolutely right, we could detect the input language automatically. However, we feel that it is a different concern and deserves a separate system of its own. To be clear, we're not trying to build a full-fledged NLP stack here, just one building block of a bigger system. A simple machine-learning or heuristic-based component could sit in front of Duckling and feed it the language parameter automatically.
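As a rough illustration of what such a heuristic front component might look like, here is a sketch in Python that guesses the language by counting indicator words. The word lists, function names and the default language are assumptions made for this example; a real system would more likely use a dedicated language-detection library, and how the result is passed on to Duckling is not shown here.

```python
import re
from typing import Dict

# Tiny, hand-picked indicator words per language (illustrative only).
INDICATORS = {
    "en": {"the", "and", "at", "on", "next", "tomorrow", "of"},
    "de": {"der", "die", "das", "und", "am", "um", "morgen", "uhr"},
}


def score_languages(text: str) -> Dict[str, float]:
    """Return, per language, the fraction of tokens that are indicator words."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        return {lang: 0.0 for lang in INDICATORS}
    return {
        lang: sum(token in words for token in tokens) / len(tokens)
        for lang, words in INDICATORS.items()
    }


def detect_language(text: str, default: str = "en") -> str:
    """Guess the language; fall back to `default` when nothing matches."""
    scores = score_languages(text)
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default


# The detected code would then be supplied as the language parameter
# of the downstream parser.
print(detect_language("tomorrow at 5pm"))
print(detect_language("morgen um 17 Uhr"))
```

A purely numeric input such as "12.03.2014" contains no indicator words, so the sketch falls back to the default, which is exactly the ambiguous case raised in the original request.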
Regarding your second point, it will be possible in the future to influence the analysis by giving 'assumptions' as inputs (user timezone, language+locale, user location, etc.).
Look at what just popped out: https://github.com/wooorm/franc
For us Clojurians this is better: https://github.com/dakrone/cld