Auto-detect language?
I'm using Blackbox (https://chrome.google.com/webstore/detail/blackbox-select-copy-past/mcgbeeipkmelnpldkobichboakdfaeon) as a screen text extractor. It auto-detects the language: I just activate it, drag a region to capture the text, and I'm done. Almost every language is supported (English, Vietnamese, Japanese, Chinese, Thai, ...), and it can also detect multiple languages at the same time. It works great compared to this. Is there any way our library could support something similar?
This program uses Windows 10's built-in OCR functionality. It requires you to specify a language before you can do OCR.
In addition, to recognize text in a specific language, the OCR component for that language has to be installed on your device.
For example, if your system language is English but you want it to recognize Japanese text, you would have to install the Japanese language first: System Settings > Time & language > Language > Add a language > Japanese. Language packs, text-to-speech and handwriting components are not required. After that, Japanese will be available as an OCR language.
Some OCR languages, for example Chinese, support English text as well, so they can recognize Chinese text mixed with English words.
Auto-detecting the language would be awesome, but I am not aware of a good C# OCR library that enables this. Do you know what OCR tool Blackbox is using?
I am going to close this issue for now, since it is out of scope of what the Windows OCR API is capable of. However, I am working on enabling Tesseract to be used instead of the Windows API, which could allow for multiple language detection, but that will be tracked in a different issue.
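For anyone who wants to try the Tesseract route in the meantime, here is a minimal sketch of how a multi-language run can be requested from the Tesseract command line. It is written in Python purely for illustration and is not how this program works or will work. It assumes the `tesseract` executable is on PATH and that the trained data for each language code (for example `eng`, `jpn`, `vie`) is installed. The helper names `build_tesseract_command` and `run_ocr` are hypothetical. Note that this is not true auto-detection: Tesseract simply loads every language passed to `-l`, joined with `+`.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, languages):
    """Return the argv list for a multi-language Tesseract run.

    Hypothetical helper: `languages` is a list of Tesseract language
    codes such as ["eng", "jpn"]; the matching traineddata files are
    assumed to be installed.
    """
    if not languages:
        raise ValueError("at least one language code is required")
    # Tesseract accepts several languages joined with '+', e.g. "eng+jpn".
    lang_arg = "+".join(languages)
    # "stdout" as the output base makes Tesseract print the recognized text.
    return ["tesseract", image_path, "stdout", "-l", lang_arg]


def run_ocr(image_path, languages):
    """Run Tesseract on an image and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, languages),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout
```

Calling `run_ocr("capture.png", ["eng", "jpn", "vie"])` would then attempt to recognize English, Japanese and Vietnamese text in a single pass. Loading more languages generally makes recognition slower, so it is probably best to list only the ones you expect to see.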