normcap
[Bug] Crash when trying to add 'fraktur' languages
What happened?
I'm very happy with normcap, it's an excellent tool for simple OCR jobs (like memes, which I can then translate more easily), but I wanted to add both Danish and German, and each of these languages has a "fraktur" version available when adding languages.
But when I try to download either of these I get an error.
I can see in my HOSTS filter that the 404 error is at least not caused by one of my filters blocking the address, so I'm not sure what it could be. A parsing error perhaps?
How did you install NormCap?
Flatpak (Flathub)
Operating System + Version?
Nobara 42 (Fedora 42 based)
[Linux only] Display Server (DS) + Desktop environment (DE)?
Wayland + KDE Plasma
Debug log output?*
14:30:37 - WARNING - normcap.gui.dbus:160 - Failed to move window via org.kde.kwin.Scripting!
14:30:37 - WARNING - normcap.gui.dbus:160 - Failed to move window via org.kde.kwin.Scripting!
14:30:49 - ERROR - normcap.gui.downloader:57 - Could not download 'https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/dan_frak.traineddata'
Traceback (most recent call last):
File "/app/lib/python3.11/site-packages/normcap/gui/downloader.py", line 48, in run
with urlopen( # noqa: S310
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 525, in open
response = meth(req, response)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 634, in http_response
response = self.parent.error(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 563, in error
return self._call_chain(*args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 496, in _call_chain
result = func(*args)
^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
14:30:49 - CRITICAL - normcap:148 - Uncaught exception!
Traceback (most recent call last):
File "/app/lib/python3.11/site-packages/normcap/gui/language_manager.py", line 95, in _on_download_error
QtWidgets.QMessageBox.critical(
TypeError: PySide6.QtWidgets.QMessageBox.critical(): not enough arguments. Note: keyword arguments are only supported for optional parameters.
14:30:49 - CRITICAL - normcap:151 - System info: {'normcap_version': '0.5.9', 'python_version': '3.11.11', 'cli_args': '/app/bin/normcap', 'is_briefcase_package': False, 'is_flatpak_package': True, 'is_appimage_package': False, 'platform': 'linux', 'desktop_environment': <DesktopEnvironment.KDE: 3>, 'display_manager_is_wayland': True, 'pyside6_version': '6.7.1', 'qt_version': '6.7.1', 'qt_library_path': '/usr/share/runtime/lib/plugins, /app/lib/python3.11/site-packages/PySide6/Qt/plugins, /usr/bin', 'locale': 'DEFAULT', 'config_directory': PosixPath('/home/redsnt/.var/app/com.github.dynobo.normcap/config/normcap'), 'resources_path': PosixPath('/app/lib/python3.11/site-packages/normcap/resources'), 'tesseract_path': PosixPath('/app/bin/tesseract'), 'tessdata_path': PosixPath('/home/redsnt/.var/app/com.github.dynobo.normcap/config/normcap/tessdata'), 'envs': {'TESSDATA_PREFIX': '/app/share', 'LD_LIBRARY_PATH': ''}, 'screens': [Screen(left=2560, top=180, right=4479, bottom=1259, device_pixel_ratio=1.0, index=0, screenshot=None), Screen(left=0, top=0, right=2559, bottom=1439, device_pixel_ratio=1.0, index=1, screenshot=None)]}
14:30:49 - CRITICAL - normcap:152 - Unfortunately, NormCap has to be terminated due to an unknown problem.
Please help improve NormCap by reporting this error, including the output above, on
https://github.com/dynobo/normcap/issues/new
Thanks!
Thanks for reporting this!
~~It seems like the url for Danish is not correct: NormCap is trying to load~~ ~~https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/dan_frak.traineddata~~
~~while the correct one seems to be~~
~~https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/dan.traineddata~~
~~Can you try to download e.g. German (DE) and report back if that works?~~
~~I'll fix the URL for the next NormCap version.~~
~~Until then, a workaround is to download the dan.traineddata-file manually in your browser and move it into ~/.config/normcap/tessdata/. Don't forget to restart NormCap afterwards.~~
~~The language dan might still not be shown in NormCap's language manager, but it should become visible in the settings menu, where you can activate it.~~
Edit: Oh, wait, dan is already available in the Language Manager; you are explicitly looking for dan_frak for the Fraktur font. Interestingly, this model seems to be special: unlike the other models, there is no fast version of it (and no best version either). But there is a default one.
Manual Workaround:
- Download model from https://github.com/tesseract-ocr/tessdata/raw/refs/tags/4.1.0/dan_frak.traineddata
- Move it into ~/.config/normcap/tessdata/
- Restart NormCap
- Activate dan_frak in the NormCap settings menu
I will fix that by using this model for dan_frak (instead of "fast") in the next version of NormCap.
Great to see you're on top of this already. In case you still needed the German fraktur variant, here is the error msg:
20:58:47 - ERROR - normcap.gui.downloader:57 - Could not download 'https://github.com/tesseract-ocr/tessdata_fast/raw/4.1.0/deu_frak.traineddata'
Traceback (most recent call last):
File "/app/lib/python3.11/site-packages/normcap/gui/downloader.py", line 48, in run
with urlopen( # noqa: S310
^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 216, in urlopen
return opener.open(url, data, timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 525, in open
response = meth(req, response)
^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 634, in http_response
response = self.parent.error(
^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 563, in error
return self._call_chain(*args)
^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 496, in _call_chain
result = func(*args)
^^^^^^^^^^^
File "/usr/lib/python3.11/urllib/request.py", line 643, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found
20:58:47 - CRITICAL - normcap:148 - Uncaught exception!
Traceback (most recent call last):
File "/app/lib/python3.11/site-packages/normcap/gui/language_manager.py", line 95, in _on_download_error
QtWidgets.QMessageBox.critical(
TypeError: PySide6.QtWidgets.QMessageBox.critical(): not enough arguments. Note: keyword arguments are only supported for optional parameters.
20:58:47 - CRITICAL - normcap:151 - System info: {'normcap_version': '0.5.9', 'python_version': '3.11.11', 'cli_args': '/app/bin/normcap', 'is_briefcase_package': False, 'is_flatpak_package': True, 'is_appimage_package': False, 'platform': 'linux', 'desktop_environment': <DesktopEnvironment.KDE: 3>, 'display_manager_is_wayland': True, 'pyside6_version': '6.7.1', 'qt_version': '6.7.1', 'qt_library_path': '/usr/share/runtime/lib/plugins, /app/lib/python3.11/site-packages/PySide6/Qt/plugins, /usr/bin', 'locale': 'DEFAULT', 'config_directory': PosixPath('/home/redsnt/.var/app/com.github.dynobo.normcap/config/normcap'), 'resources_path': PosixPath('/app/lib/python3.11/site-packages/normcap/resources'), 'tesseract_path': PosixPath('/app/bin/tesseract'), 'tessdata_path': PosixPath('/home/redsnt/.var/app/com.github.dynobo.normcap/config/normcap/tessdata'), 'envs': {'TESSDATA_PREFIX': '/app/share', 'LD_LIBRARY_PATH': ''}, 'screens': [Screen(left=2560, top=180, right=4479, bottom=1259, device_pixel_ratio=1.0, index=0, screenshot=None), Screen(left=0, top=0, right=2559, bottom=1439, device_pixel_ratio=1.0, index=1, screenshot=None)]}
20:58:47 - CRITICAL - normcap:152 - Unfortunately, NormCap has to be terminated due to an unknown problem.
Please help improve NormCap by reporting this error, including the output above, on
https://github.com/dynobo/normcap/issues/new
Thanks!
I probably should've mentioned I use the flatpak version, but I quickly found the right config and tessdata folder at ~/.var/app/com.github.dynobo.normcap/config/normcap/tessdata/.
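For anyone else applying the manual workaround, the steps can be scripted. This is only a minimal sketch and not part of NormCap: the names `find_tessdata_dir` and `install_model` and the `fetch` hook are made up for illustration, the URL pattern is the one from the workaround above, and the two candidate folders are the ones mentioned in this thread.

```python
from pathlib import Path
from urllib.request import urlopen

# URL pattern taken from the manual workaround (default tessdata repo, tag 4.1.0).
MODEL_URL = (
    "https://github.com/tesseract-ocr/tessdata/raw/refs/tags/4.1.0/{lang}.traineddata"
)

# Folders mentioned in this thread: Flatpak location first, then the regular one.
CANDIDATE_DIRS = (
    Path.home() / ".var/app/com.github.dynobo.normcap/config/normcap/tessdata",
    Path.home() / ".config/normcap/tessdata",
)


def find_tessdata_dir(candidates=CANDIDATE_DIRS):
    """Return the first candidate folder that exists, or None if none does."""
    for path in candidates:
        if path.is_dir():
            return path
    return None


def install_model(lang, dest_dir, fetch=None):
    """Download <lang>.traineddata into dest_dir and return the written path.

    `fetch` is a hypothetical hook (URL -> bytes) so the download can be
    swapped out, e.g. for testing without network access.
    """
    if fetch is None:
        def fetch(url):
            with urlopen(url, timeout=30) as response:  # noqa: S310
                return response.read()

    target = Path(dest_dir) / f"{lang}.traineddata"
    target.write_bytes(fetch(MODEL_URL.format(lang=lang)))
    return target
```

Calling `install_model("dan_frak", find_tessdata_dir())` should place the file where NormCap looks for it, provided `find_tessdata_dir()` did not return `None`. Whether `deu_frak` is available under the same URL pattern is an assumption worth checking in the browser first. NormCap still needs a restart afterwards.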
In #761, I added fallback logic: if a language is not found (404) among the "fast" models, NormCap will try to download the "normal" model or the "best" model.
Only if the model doesn't exist in any of the 3 variants (which should not be the case) will an error be displayed.
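Roughly, the fallback idea looks like the sketch below. This is an illustration and not the actual code from #761: the function name `download_with_fallback`, the `fetch` hook, and the order fast, normal, best are assumptions, and the `tessdata_best` repository name is assumed by analogy with the two URLs seen earlier in this thread.

```python
from urllib.error import HTTPError
from urllib.request import urlopen

# Assumed repository order: "fast" first, then the default repo, then "best".
REPOS = ("tessdata_fast", "tessdata", "tessdata_best")
URL = "https://github.com/tesseract-ocr/{repo}/raw/4.1.0/{lang}.traineddata"


def _fetch(url):
    """Download a URL and return its body as bytes."""
    with urlopen(url, timeout=30) as response:  # noqa: S310
        return response.read()


def download_with_fallback(lang, fetch=_fetch):
    """Try each model variant in order; return (url, data) for the first hit."""
    for repo in REPOS:
        url = URL.format(repo=repo, lang=lang)
        try:
            return url, fetch(url)
        except HTTPError as exc:
            if exc.code != 404:
                raise  # only a missing file triggers the fallback
    raise FileNotFoundError(f"No model for '{lang}' in any of {REPOS}")
```

With this shape, `dan_frak` would get a 404 from the "fast" repository and then be fetched from the default one, while errors other than 404 are still raised immediately instead of being hidden by the fallback.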