
Probabilistically detecting the language that the user entered

Open martinheidegger opened this issue 10 years ago • 3 comments

It is nice that Duckling supports multiple languages, but I thought it would be even more awesome if the language could be determined from the input itself, i.e. by looking for indicators that the input is in a certain language. That way the user wouldn't need to think about which language to enter the text in.

Additionally, it would be good to have a "preferred language" option. Example: if a date is not clearly English or German, but a former input of the user (session data) was English, then English would be preferred.

martinheidegger, Oct 02 '14 04:10

Hey @martinheidegger,

You're absolutely right, we could detect the input language automatically. However, we feel that it is a different concern, and deserves a separate system on its own. To be clear, we're not trying to build a full-fledged NLP stack here, but just a brick of a bigger system. A simple machine-learning or heuristic-based component could sit in front of Duckling and feed it the language parameter automatically.

Regarding your second point, it will be possible in the future to influence the analysis by giving 'assumptions' as inputs (user timezone, language+locale, user location, etc.).
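To make the idea of a component sitting in front of Duckling concrete, here is a minimal sketch of a heuristic detector with a "preferred language" fallback. Everything in it is an assumption for illustration: the `detect_language` function, the `STOPWORDS` table, and the tiny word lists are hypothetical and not part of Duckling. A real setup would more likely use a trained model or a dedicated language-detection library.

```python
from typing import Optional

# Hypothetical, deliberately tiny word lists used as language indicators.
# They only illustrate the idea; they are far too small for real use.
STOPWORDS = {
    "en": {"the", "and", "at", "on", "next", "tomorrow", "in", "of"},
    "de": {"der", "die", "das", "und", "am", "um", "morgen", "in"},
}


def detect_language(text: str, preferred: Optional[str] = None) -> Optional[str]:
    """Guess the language of `text` by counting indicator words.

    Falls back to `preferred` (for example, the language of the user's
    previous input) when no indicator matches or the result is a tie.
    """
    # Lowercase, split on whitespace, and drop surrounding punctuation.
    tokens = [tok.strip(".,;:!?") for tok in text.lower().split()]

    # Score each language by how many tokens appear in its word list.
    scores = {
        lang: sum(tok in words for tok in tokens)
        for lang, words in STOPWORDS.items()
    }

    best = max(scores.values())
    if best == 0:
        # No indicator at all (e.g. a bare numeric date): use the preference.
        return preferred

    winners = [lang for lang, score in scores.items() if score == best]
    if len(winners) == 1:
        return winners[0]

    # Ambiguous input: let the preferred language break the tie.
    return preferred
```

The returned code (or `None` when nothing can be decided) would then be passed along as the language parameter of the parsing call, so the parser itself stays unaware of how the language was chosen.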

blandinw, Oct 02 '14 09:10

Look at what just popped out: https://github.com/wooorm/franc

blandinw, Oct 03 '14 13:10

For us Clojurians this is better: https://github.com/dakrone/cld

ar7hur, Oct 03 '14 13:10