Game2Text
Tesseract Using Environment Variable's Path In Windows Instead Of Bundled Path
Hi, in the latest version, the Tesseract engine uses the path set in the environment variable instead of the bundled path, which causes it to throw an error (or OCR does not work in the release version).
pytesseract.pytesseract.TesseractError: (1, 'Error opening data file C:\\Users\\PC1\\Downloads\\Tesseract-OCR\\jpn.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'jpn\' Error opening data file C:\\Users\\PC1\\Downloads\\Tesseract-OCR\\eng.traineddata Please make sure the TESSDATA_PREFIX environment variable is set to your "tessdata" directory. Failed loading language \'eng\' Tesseract couldn\'t load any languages! Could not set option: use_new_state_cost=F Could not set option: segment_segcost_rating=F Could not set option: enable_new_segsearch=0 Could not initialize tesseract.')
2021-07-18T20:55:47Z <Greenlet at 0x1b188676048: _process_message({'call': 14.888897478777801, 'name': 'recognize_im, <geventwebsocket.websocket.WebSocket object at 0x0)> failed with TesseractError
The possible bug is in https://github.com/mathewthe2/Game2Text/blob/50cf52cfdbc911daa52a71bd136b919e32a9e718/tools.py#L54
where the Windows branch does not return a proper tessdata-dir path value. Adding a return seems to fix this problem for me.
Which Tesseract version are you referring to?
And what do you mean by proper tessdata-dir path? Did you export the TESSDATA_PREFIX environment variable manually?
I have 5.0 installed on my machine, so the path in my environment variable settings points to my own installation folder, which I use for my own project. By proper tessdata-dir path I mean the bundled path, i.e. the Tesseract bundled together with the executable.
In the Darwin branch there is a return statement for it, https://github.com/mathewthe2/Game2Text/blob/50cf52cfdbc911daa52a71bd136b919e32a9e718/tools.py#L44
so I figured that Windows needs its own return statement too, and added
return '--tessdata-dir {}'.format('%r'%str(Path(WIN_TESSERACT_DIR, "tessdata")))
after the line https://github.com/mathewthe2/Game2Text/blob/50cf52cfdbc911daa52a71bd136b919e32a9e718/tools.py#L53
and with that change my build worked.
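To make the idea concrete, here is a rough sketch of what I mean. It is not the actual code in tools.py: the function name `tessdata_config` and both directory constants are placeholders I made up for illustration, and the real bundle layout may differ. The point is only that every platform branch that ships a bundled Tesseract should return a `--tessdata-dir` option, so the bundled language data wins over whatever TESSDATA_PREFIX or PATH points to.

```python
import platform
from pathlib import Path

# Hypothetical bundle locations (assumptions for this sketch only);
# the real constants are defined in the project's tools.py.
WIN_TESSERACT_DIR = Path("resources", "bin", "win", "tesseract")
OSX_TESSERACT_DIR = Path("resources", "bin", "mac", "tesseract")


def tessdata_config(system=None):
    """Return a config string pointing Tesseract at the bundled tessdata.

    `system` defaults to platform.system(); it is a parameter only so the
    branches can be exercised on any machine.
    """
    system = system or platform.system()
    if system == "Darwin":
        bundle = OSX_TESSERACT_DIR
    elif system == "Windows":
        # This is the branch that was missing a return value.
        bundle = WIN_TESSERACT_DIR
    else:
        # No bundled Tesseract assumed here: fall back to the system default.
        return ""
    # Same formatting as the one-line fix above: %r wraps the path in quotes.
    return '--tessdata-dir {}'.format('%r' % str(Path(bundle, "tessdata")))
```

The returned string would then be passed through the `config` argument of the pytesseract call, for example `pytesseract.image_to_string(image, lang='jpn', config=tessdata_config())`.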