Browser workers unable to process Blobs

Open minimusubi opened this issue 3 years ago • 2 comments

Describe the bug Tesseract.js for the browser errors while attempting to process a Blob, despite the documentation stating that Blobs are supported.

To Reproduce Steps to reproduce the behavior:

index.html (single file example)

<!DOCTYPE html>
<html>
	<head>
		<script src='https://unpkg.com/[email protected]/dist/tesseract.min.js'></script>
		<script>
			fetch('https://i.imgur.com/DWPz2JA.png').then((response) => {
				return response.blob()
			}).then((blob) => {
				Tesseract.recognize(blob, 'eng', {logger: m => console.log(m)}).then(({ data: { text } }) => {
					console.log(text);
				});
			});
		</script>
	</head>
</html>
  1. Save the above content as an html file
  2. Open the file locally in Chrome/Firefox
  3. Wait for Tesseract.js to initialize
  4. See error

Expected behavior Tesseract.js will begin attempting to recognize text in the image

Screenshots An error occurs. Screenshot from Chrome 88.0.4324.150 (Official Build) (64-bit) image

Screenshot from Firefox 85.0 (64-bit) image

Desktop (please complete the following information):

  • Browsers
    • Chrome 88.0.4324.150 (Official Build) (64-bit)
    • Firefox 86.0 (64-bit)

Additional context The issue appears to be caused by https://github.com/naptha/tesseract.js/commit/6481256f5eeecbfaaec7984415c24c1d075ba50f, a fix for another issue. Blobs do not have a name property. According to the commit tags, it was introduced around 1.2.1.

Problem code (from latest commit, as of writing): https://github.com/naptha/tesseract.js/blob/90466c3b5504a9220ba0ff91ccec22003f72cbd2/src/worker/browser/loadImage.js#L79-L81
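To illustrate the failure mode described above, here is a minimal sketch. It assumes only that some code path calls a string method on `image.name`; the `looksLikePbm` function is a hypothetical stand-in for such a check and is not the actual tesseract.js source.

```javascript
// A plain Blob exposes size and type only; the name property belongs to File.
const blob = new Blob(['not a real image'], { type: 'image/png' });
console.log(blob.name); // undefined

// Hypothetical stand-in for a filename check; NOT the actual tesseract.js code.
function looksLikePbm(image) {
  // Throws a TypeError when image.name is undefined, as it is for a Blob.
  return image.name.endsWith('.pbm');
}

let caught = null;
try {
  looksLikePbm(blob);
} catch (err) {
  caught = err;
}
console.log(caught instanceof TypeError); // true
```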

minimusubi avatar Feb 24 '21 06:02 minimusubi

I'm having the same trouble. The way I found to bypass it is to assign something to blob.name before passing the blob in. After that, everything seems to work fine.
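A rough sketch of this workaround, assuming any non-empty string is acceptable as the name; the helper name `withName` and the default `'image.png'` are made up for illustration:

```javascript
// Hypothetical helper: attach a name to a Blob so that code reading
// image.name finds a string instead of undefined.
function withName(blob, name = 'image.png') {
  blob.name = name; // Blob instances are extensible, so this adds an own property
  return blob;
}

// Usage in the browser (assumes the Tesseract global from the issue's example):
// Tesseract.recognize(withName(blob), 'eng').then(({ data: { text } }) => {
//   console.log(text);
// });
```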

ICE-Cold-Ethanol avatar Apr 29 '21 02:04 ICE-Cold-Ethanol

A clean and easy workaround is to create a File object from the blob and assign a name.
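A minimal sketch of that approach, using the standard `File` constructor; the helper name `blobToFile` and the default file name are assumptions chosen for illustration:

```javascript
// Hypothetical helper: wrap a Blob in a File so it carries a name.
// The original blob is left unmodified and its MIME type is preserved.
function blobToFile(blob, name = 'image.png') {
  return new File([blob], name, { type: blob.type });
}

// Usage in the browser (assumes the Tesseract global from the issue's example):
// fetch('https://i.imgur.com/DWPz2JA.png')
//   .then((response) => response.blob())
//   .then((blob) => Tesseract.recognize(blobToFile(blob), 'eng'))
//   .then(({ data: { text } }) => console.log(text));
```

Compared with assigning to `blob.name` directly, this avoids mutating the object returned by `response.blob()`.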

IggWeb avatar Nov 30 '21 14:11 IggWeb

@Smiley43210 Thanks for flagging this issue, as well as providing an explanation of the root cause and reproducible example. The commit you identify as problematic has been rolled back in the master branch (as well as version 3), and I was able to run your example without issues in the current version. Therefore, this should be resolved.

Balearica avatar Aug 22 '22 04:08 Balearica