tesseract.js
Failed to dectet OS
Describe the bug A clear and concise description of what the bug is.
To Reproduce
Using Pure JS+HTML like this:
const exampleImage = 'some.png'; // path or URL of the image to process
const worker = Tesseract.createWorker({
  logger: m => console.log(m)
});
Tesseract.setLogging(true);
work();
async function work() {
  await worker.load();
  await worker.loadLanguage('eng');
  await worker.initialize('eng');
  // Script/orientation detection; this is the call that fails.
  let result = await worker.detect(exampleImage);
  console.log(result.data);
  result = await worker.recognize(exampleImage);
  console.log(result.data);
  await worker.terminate();
}
Expected behavior Console log "Failed to dectet OS"
Screenshots
Desktop (please complete the following information):
- OS: Windows10
- Browser Chrome
- Version 86
Additional context Source PNG file URL: http://cn8.frp.cool:12385/upload/1616768763_377_123.png
I am encountering this too, but with Firefox 91.
Same issue for me on Chrome and Firefox. With the current version of tesseract.js, the detect method does not work.
Also, it fails no matter the source. Video, canvas, image and base64 all fail.
I am unable to reproduce this issue (I ran the code snippet above and it worked for me), and the image linked above is dead. For anybody with this issue, please clarify whether the demo site runs for you. That should help us figure out whether the issue stems from the particulars of your browser/system or how you are deploying tesseract.js.
https://tesseract.projectnaptha.com/
@JobberRT @munsterlander I looked more into this, and believe I understand the issue. In the same way that Tesseract does not always detect text, it does not always detect script/orientation. When run on an image with only a couple of words, it will not detect script/orientation, and tesseract.js throws an error.
The fact that Tesseract does not recognize script/orientation on such images is outside the scope of this project (as we do not edit the Tesseract engine). However, throwing an error when this happens does not seem like the correct behavior. Presumably tesseract.js should simply return a null value rather than throwing an exception (similar to what happens when you run recognize on a page with no text).
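In the meantime, a caller can guard against the failure on their side. The following is a minimal sketch, assuming a failed detection surfaces as a rejected promise from worker.detect. safeDetect is a hypothetical helper name, not part of tesseract.js.

```javascript
// Hypothetical helper (not part of tesseract.js): wraps worker.detect so that
// a failed script/orientation detection yields null instead of a rejection.
async function safeDetect(worker, image) {
  try {
    const result = await worker.detect(image);
    return result.data;
  } catch (err) {
    // Assumption: the rejection means detection failed on this image.
    // Any other failure (e.g. an unreadable image) would land here too.
    console.warn('detect failed:', err);
    return null;
  }
}
```

Inside the work function above, this would be called as `const data = await safeDetect(worker, exampleImage);`, followed by a null check before reading any fields.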
I edited detect so that it now returns null values when OS detection is not possible rather than throwing an error and killing the API. As this is technically a breaking change, it was implemented in the dev/v4 branch, and will be included with the next major release (v4). To learn more about changes in v4 see Issue #662.
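With this change, callers would check for null values instead of catching an exception. Below is a minimal sketch of such a check. The field names script and orientation_degrees are assumptions about the shape of the result data and should be verified against the v4 release; describeDetection is a hypothetical helper.

```javascript
// Hypothetical helper: turns the data object from worker.detect into a
// readable summary, treating null fields as "not detected".
// Assumption: the fields are named `script` and `orientation_degrees`.
function describeDetection(data) {
  if (!data || data.script == null || data.orientation_degrees == null) {
    return 'script/orientation not detected';
  }
  return `${data.script}, rotated ${data.orientation_degrees} degrees`;
}
```

The `== null` comparison is deliberate: it matches both null and undefined, but not an orientation of 0 degrees, which is a valid result.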
@Balearica Thanks! I will look into #662 and check it!