tesseract.js
Browser workers unable to process Blobs
Describe the bug
Tesseract.js for the browser errors while attempting to process a Blob, despite the documentation stating that Blobs are supported.
To Reproduce
Steps to reproduce the behavior:
index.html (single file example)
<!DOCTYPE html>
<html>
<head>
<script src='https://unpkg.com/[email protected]/dist/tesseract.min.js'></script>
<script>
fetch('https://i.imgur.com/DWPz2JA.png').then((response) => {
  return response.blob();
}).then((blob) => {
  Tesseract.recognize(blob, 'eng', {logger: m => console.log(m)}).then(({ data: { text } }) => {
    console.log(text);
  });
});
</script>
</head>
</html>
- Save the above content as an html file
- Open the file locally in Chrome/Firefox
- Wait for Tesseract.js to initialize
- See error
Expected behavior
Tesseract.js will begin attempting to recognize text in the image.
Screenshots
An error occurs. Screenshot from Chrome 88.0.4324.150 (Official Build) (64-bit)
Screenshot from Firefox 85.0 (64-bit)
Desktop (please complete the following information):
- Browsers
- Chrome 88.0.4324.150 (Official Build) (64-bit)
- Firefox 86.0 (64-bit)
Additional context
The issue appears to be caused by https://github.com/naptha/tesseract.js/commit/6481256f5eeecbfaaec7984415c24c1d075ba50f, a fix for another issue. Blobs do not have a name property. According to the commit tags, it was introduced around 1.2.1.
Problem code (from latest commit, as of writing): https://github.com/naptha/tesseract.js/blob/90466c3b5504a9220ba0ff91ccec22003f72cbd2/src/worker/browser/loadImage.js#L79-L81
I'm having the same trouble as well. I found that the way to bypass it is to assign something to blob.name before passing it in. Then everything seems to work fine.
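For anyone landing here, a minimal sketch of that bypass might look like the following. The helper name `withBlobName` and the default file name are made up for illustration and are not part of Tesseract.js. It only relies on Blob instances being ordinary extensible objects.

```javascript
// Hypothetical helper (not part of Tesseract.js): attach a `name` to a Blob
// so that code reading `image.name` finds a string instead of undefined.
function withBlobName(blob, name = 'image.png') {
  // The name here is an arbitrary placeholder, not a value the library requires.
  blob.name = name;
  return blob;
}

// Usage in the reproduction above (left as a comment, since it needs the
// Tesseract global loaded in the page):
// Tesseract.recognize(withBlobName(blob), 'eng', { logger: m => console.log(m) });
```

This mutates the Blob you pass in, which is fine for a throwaway object but worth keeping in mind if the same Blob is used elsewhere.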
A clean and easy workaround is to create a File object from the blob and assign a name.
@Smiley43210 Thanks for flagging this issue, as well as providing an explanation of the root cause and reproducible example. The commit you identify as problematic has been rolled back in the master branch (as well as version 3), and I was able to run your example without issues in the current version. Therefore, this should be resolved.