Tesseract Using Environment Variable's Path In Windows Instead Of Bundled Path

Open ryuga93 opened this issue 3 years ago • 2 comments

Hi, in the latest version, the Tesseract engine uses the path set in the Windows environment variable instead of the path bundled with the application, causing it to throw an error (or, in the release version, OCR not working).

pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Users\\PC1\\Downloads\\Tesseract-OCR\\jpn.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language \'jpn\'
Error opening data file C:\\Users\\PC1\\Downloads\\Tesseract-OCR\\eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory.
Failed loading language \'eng\'
Tesseract couldn\'t load any languages!
Could not set option: use_new_state_cost=F
Could not set option: segment_segcost_rating=F
Could not set option: enable_new_segsearch=0
Could not initialize tesseract.')
2021-07-18T20:55:47Z <Greenlet at 0x1b188676048: _process_message({'call': 14.888897478777801, 'name': 'recognize_im, <geventwebsocket.websocket.WebSocket object at 0x0)> failed with TesseractError

The possible bug is in https://github.com/mathewthe2/Game2Text/blob/50cf52cfdbc911daa52a71bd136b919e32a9e718/tools.py#L54

where the Windows branch does not return a --tessdata-dir value pointing at the bundled tessdata directory. Adding a return statement there seems to fix the problem for me.

ryuga93 avatar Jul 18 '21 21:07 ryuga93
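
The error in the report shows Tesseract looking for jpn.traineddata and eng.traineddata in a directory taken from the environment. A minimal sketch for checking whether a given directory actually holds those files is shown here; the helper name missing_traineddata is hypothetical and is not part of Game2Text or pytesseract.

```python
from pathlib import Path


def missing_traineddata(tessdata_dir, languages=("jpn", "eng")):
    """Return the languages whose <lang>.traineddata file is absent.

    Hypothetical helper for diagnosis only; the default languages are the
    two named in the error message from the report.
    """
    directory = Path(tessdata_dir)
    return [
        lang
        for lang in languages
        if not (directory / f"{lang}.traineddata").is_file()
    ]
```

Calling it with the value of os.environ.get("TESSDATA_PREFIX") and then with the bundled tessdata directory would show which of the two locations Tesseract can actually load languages from.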

Which Tesseract version are you referring to?

And what do you mean by proper tessdata-dir path? Did you export the TESSDATA_PREFIX environment variable manually?

mathewthe2 avatar Aug 17 '21 05:08 mathewthe2

I have Tesseract 5.0 installed on my machine, and the path in my environment variable points to that installation folder because I use it for my own project. By "proper tessdata-dir path" I mean the bundled path, i.e. the Tesseract that is bundled together with the executable.

In the Darwin branch there is a return statement for it, https://github.com/mathewthe2/Game2Text/blob/50cf52cfdbc911daa52a71bd136b919e32a9e718/tools.py#L44

so I figured that Windows needs its own return statement too, and added return '--tessdata-dir {}'.format('%r'%str(Path(WIN_TESSERACT_DIR, "tessdata")))

after the line https://github.com/mathewthe2/Game2Text/blob/50cf52cfdbc911daa52a71bd136b919e32a9e718/tools.py#L53

after which my compiled build worked.

ryuga93 avatar Aug 17 '21 06:08 ryuga93
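
The fix described in the comment above can be sketched roughly as follows. This is not the actual code from tools.py: the function name tessdata_config, the optional system parameter, and both directory constants' values are assumptions for illustration. Only the return expression is taken from the comment.

```python
import platform
from pathlib import Path

# Assumed bundle locations; the real constants are defined in Game2Text's tools.py.
WIN_TESSERACT_DIR = Path("resources", "bin", "win", "tesseract")
OSX_TESSERACT_DIR = Path("resources", "bin", "mac", "tesseract")


def tessdata_config(system=None):
    """Build a --tessdata-dir option that points at the bundled tessdata folder.

    Hypothetical stand-in for the function discussed in the issue. The key
    point is that every bundled platform returns a value, so Tesseract does
    not fall back to TESSDATA_PREFIX from the environment.
    """
    system = system or platform.system()
    if system == "Windows":
        bundle_dir = WIN_TESSERACT_DIR
    elif system == "Darwin":
        bundle_dir = OSX_TESSERACT_DIR
    else:
        # No bundled Tesseract assumed here; let the system install decide.
        return ""
    # Same expression as the return statement quoted in the comment.
    return '--tessdata-dir {}'.format('%r' % str(Path(bundle_dir, "tessdata")))
```

The resulting string would be passed to pytesseract through its config argument, for example pytesseract.image_to_string(image, lang="jpn", config=tessdata_config()).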