lingua-go
Detect multiple languages in mixed-language text
Currently, for a given input string, only the most likely language is returned. However, if the input contains contiguous sections in different languages, it would be desirable to detect all of them and return an ordered sequence of items, each consisting of a start index, an end index, and the detected language.
Input: He turned around and asked: "Entschuldigen Sie, sprechen Sie Deutsch?"
Output:
[
{"start": 0, "end": 27, "language": ENGLISH},
{"start": 28, "end": 69, "language": GERMAN}
]
This would be quite a useful feature!
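To make the requested output shape concrete, here is a minimal sketch in Go. Everything in it is hypothetical and not part of lingua-go's API: the `Segment` type, the `segmentByWord` function, and the `toyClassify` stand-in, which only recognises the four German words from the example. It assumes word-level granularity, byte offsets, and inclusive end indices, with whitespace between two segments attributed to the earlier one, which is how the indices in the example above work out.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// Segment is a hypothetical result item: inclusive byte offsets plus a label.
type Segment struct {
	Start, End int
	Language   string
}

// segmentByWord labels every space-separated word with classify and merges
// adjacent words that share a label into one segment. A segment ends on the
// byte just before the next segment starts, so separating spaces belong to
// the earlier segment.
func segmentByWord(text string, classify func(word string) string) []Segment {
	var segs []Segment
	pos := 0
	for pos < len(text) {
		// Skip spaces to find the start of the next word.
		start := pos
		for start < len(text) && text[start] == ' ' {
			start++
		}
		if start == len(text) {
			break
		}
		// Scan to the end of the word (exclusive).
		end := start
		for end < len(text) && text[end] != ' ' {
			end++
		}
		lang := classify(text[start:end])
		if n := len(segs); n == 0 || segs[n-1].Language != lang {
			if n > 0 {
				segs[n-1].End = start - 1 // close the previous segment
			}
			segs = append(segs, Segment{Start: start, Language: lang})
		}
		pos = end
	}
	if n := len(segs); n > 0 {
		segs[n-1].End = len(text) - 1 // last segment runs to the end of the text
	}
	return segs
}

// toyClassify is a stand-in for a real per-word detector (assumption: it only
// knows the German words used in the example and calls everything else English).
func toyClassify(word string) string {
	w := strings.ToLower(strings.TrimFunc(word, func(r rune) bool {
		return !unicode.IsLetter(r)
	}))
	switch w {
	case "entschuldigen", "sie", "sprechen", "deutsch":
		return "GERMAN"
	}
	return "ENGLISH"
}

func main() {
	text := `He turned around and asked: "Entschuldigen Sie, sprechen Sie Deutsch?"`
	for _, s := range segmentByWord(text, toyClassify) {
		fmt.Printf("%+v\n", s)
	}
}
```

With the toy classifier, this prints `{Start:0 End:27 Language:ENGLISH}` followed by `{Start:28 End:69 Language:GERMAN}`, matching the example. A real implementation would need a statistical per-word or sliding-window detector in place of `toyClassify`, and would probably want to smooth out isolated misclassified words before merging.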