whatlanggo
Not detecting language in large text
Example: https://play.golang.org/p/qupLXwVQc4m
The first example is a large text in English. The library can't produce a confident result: the confidence is negative. The second example is a couple of sentences from the same text, and the result is correct. It doesn't matter which language the text is in. Past a certain size threshold, detection always breaks.
I checked https://github.com/kapsteur/franco, which seems to use the same model and trigrams. It works.
Thanks @creker. I'm looking into it.