Arabic instead of English

Open faveoled opened this issue 1 year ago • 2 comments

What happened?

All English text is recognized as Arabic characters. I am using the latest AppImage.

How did you install NormCap?

AppImage (Linux)

Operating System + Version?

Ubuntu 22.04.3

[Linux only] Display Server (DS) + Desktop environment (DE)?

Xorg

Debug log output?

13:00:38 - INFO    - normcap:49 - Start NormCap v0.5.4
13:00:38 - DEBUG   - normcap:107 - Append /tmp/.mount_NormCaZT6zGv/usr/bin to AppImage internal PATH
13:00:38 - DEBUG   - normcap.gui.tray:77 - System info:
{'normcap_version': '0.5.4', 'python_version': '3.10.13', 'cli_args': '/tmp/.mount_NormCaZT6zGv/usr/app/normcap/__main__.py -v debug', 'is_briefcase_package': True, 'is_flatpak_package': False, 'is_appimage_package': True, 'platform': 'linux', 'desktop_environment': <DesktopEnvironment.GNOME: 1>, 'display_manager_is_wayland': False, 'pyside6_version': '6.6.1', 'qt_version': '6.6.1', 'qt_library_path': '/tmp/.mount_NormCaZT6zGv/usr/app_packages/PySide6/Qt/plugins, /tmp/.mount_NormCaZT6zGv/usr/python/bin', 'locale': 'DEFAULT', 'config_directory': PosixPath('/home/user/.config/normcap'), 'resources_path': PosixPath('/tmp/.mount_NormCaZT6zGv/usr/app/normcap/resources'), 'tesseract_path': PosixPath('/tmp/.mount_NormCaZT6zGv/usr/bin/tesseract'), 'tessdata_path': PosixPath('/home/user/.config/normcap/tessdata'), 'envs': {'TESSDATA_PREFIX': None, 'LD_LIBRARY_PATH': None}, 'screens': [Screen(left=0, top=0, right=1365, bottom=767, device_pixel_ratio=1.0, index=0, screenshot=None)]}
13:00:38 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (show_introduction: None)
13:00:38 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (cli_mode: False)
13:00:38 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (background_mode: False)
13:00:38 - DEBUG   - normcap.gui.settings:162 - Skip update of non existing setting (clipboard_handler: None)
13:00:38 - DEBUG   - normcap.gui.tray:388 - Listen on local socket v0.5.4-normcap.
13:00:38 - DEBUG   - normcap.screengrab.main:20 - Select capture method QT
13:00:38 - DEBUG   - normcap.gui.utils:22 - Save debug image as /tmp/normcap/2024-01-18_10-00-38_raw_screen0.png
13:00:38 - DEBUG   - normcap.gui.window:52 - Create window for screen 0
13:00:38 - DEBUG   - normcap.gui.window:128 - Set window of screen 0 to fullscreen
13:00:38 - DEBUG   - normcap:183 - [QT] qtwarningmsg - qsystemtrayicon::setvisible: no icon set
13:00:38 - DEBUG   - normcap.ocr.tesseract:24 - Executing '/tmp/.mount_NormCaZT6zGv/usr/bin/tesseract --list-langs --tessdata-dir /home/user/.config/normcap/tessdata'
13:00:38 - DEBUG   - normcap.ocr.tesseract:37 - Tesseract command output: List of available languages in "/home/user/.config/normcap/tessdata/" (6): ¬ ara ¬ chi_sim ¬ deu ¬ eng ¬ rus ¬ spa ¬
13:00:44 - DEBUG   - normcap.gui.tray:354 - Hide 1 window
13:00:44 - INFO    - normcap.gui.tray:246 - Crop image to region (375, 428, 666, 463)
13:00:44 - DEBUG   - normcap.gui.utils:22 - Save debug image as /tmp/normcap/2024-01-18_10-00-44_cropped.png
13:00:44 - DEBUG   - normcap.gui.tray:271 - Start OCR
13:00:44 - DEBUG   - normcap.ocr.enhance:84 - Scale image x2
13:00:44 - DEBUG   - normcap.ocr.enhance:57 - Pad image by 80px
13:00:44 - DEBUG   - normcap.ocr.recognize:35 - Run Tesseract on image of size (744, 232) with args:
TessArgs(tessdata_path=PosixPath('/home/user/.config/normcap/tessdata'), lang='ara', oem=<OEM.DEFAULT: 3>, psm=<PSM.AUTO: 3>)
13:00:44 - DEBUG   - normcap.ocr.tesseract:24 - Executing '/tmp/.mount_NormCaZT6zGv/usr/bin/tesseract /tmp/tmphgsz7_tg/normcap_tesseract_input.png /tmp/tmphgsz7_tg/normcap_tesseract_input.png -c tessedit_create_tsv=1 -l ara --oem 3 --psm 3 --tessdata-dir /home/user/.config/normcap/tessdata -c tessedit_write_images=1 -c tessedit_dump_pageseg_images=1'
13:00:44 - DEBUG   - normcap.ocr.tesseract:37 - Tesseract command output: 
13:00:44 - DEBUG   - normcap.ocr.tesseract:67 - Skip moving file to temp dir, it does not exist: /tmp/tmphgsz7_tg/normcap_tesseract_input.png.png_debug.pdf
13:00:44 - DEBUG   - normcap.ocr.recognize:44 - OCR result:
OcrResult(tess_args=TessArgs(tessdata_path=PosixPath('/home/user/.config/normcap/tessdata'), lang='ara', oem=<OEM.DEFAULT: 3>, psm=<PSM.AUTO: 3>), words=[{'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 1, 'left': 499, 'top': 96, 'width': 152, 'height': 26, 'conf': 4.121132, 'text': 'ع1939ممة.'}, {'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 2, 'left': 356, 'top': 96, 'width': 132, 'height': 27, 'conf': 41.292465, 'text': '86_64)-4'}, {'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 3, 'left': 337, 'top': 110, 'width': 3, 'height': 6, 'conf': 74.659691, 'text': '.'}, {'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 4, 'left': 318, 'top': 96, 'width': 8, 'height': 20, 'conf': 85.199936, 'text': '5'}, {'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 5, 'left': 299, 'top': 110, 'width': 5, 'height': 6, 'conf': 59.467552, 'text': '.'}, {'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 6, 'left': 117, 'top': 76, 'width': 170, 'height': 51, 'conf': 0.0, 'text': '6-موعمعهاة/'}, {'level': 5, 'page_num': 1, 'block_num': 1, 'par_num': 1, 'line_num': 1, 'word_num': 7, 'left': 99, 'top': 110, 'width': 7, 'height': 6, 'conf': 67.63858, 'text': '.'}], image=<PySide6.QtGui.QImage(QSize(744, 232),format=QImage::Format_RGB32,depth=32,devicePixelRatio=1,bytesPerLine=2976,sizeInBytes=690432) at 0x7f7a63936fc0>, magic_scores={}, parsed='')
13:00:44 - INFO    - normcap.ocr.magics.email_magic:60 - 0 emails found 
13:00:44 - DEBUG   - normcap.ocr.magics.email_magic:71 - 0/32 (0.0) chars in emails
13:00:44 - INFO    - normcap.ocr.magics.url_magic:57 - 0 URLs found 
13:00:44 - DEBUG   - normcap.ocr.magics.url_magic:65 - 0/38 (0.0) chars in urls
13:00:44 - DEBUG   - normcap.ocr.magic:82 - Magic scores:
{'SingleLineMagic': 50, 'MultiLineMagic': 0, 'ParagraphMagic': 0.0, 'EmailMagic': 0.0, 'UrlMagic': 0.0}
13:00:44 - DEBUG   - normcap.ocr.recognize:48 - Parsed text:
ع1939ممة. 86_64)-4 . 5 . 6-موعمعهاة/ .
13:00:44 - DEBUG   - normcap.gui.utils:22 - Save debug image as /tmp/normcap/2024-01-18_10-00-44_enhanced.png
13:00:44 - INFO    - normcap.gui.tray:289 - Text from OCR:
ع1939ممة. 86_64)-4 . 5 . 6-موعمعهاة/ .
13:00:44 - DEBUG   - normcap.gui.tray:332 - Copy text to clipboard
13:00:44 - DEBUG   - normcap.clipboard.handlers.windll:187 - WindllHandler is incompatible on non-Windows systems
13:00:44 - DEBUG   - normcap.clipboard.handlers.pbcopy:24 - PbCopyHandler is incompatible on non-macOS systems
13:00:44 - DEBUG   - normcap.clipboard.handlers.qtclipboard:34 - QtCopyHandler is compatible
13:00:44 - DEBUG   - normcap.clipboard.handlers.wlclipboard:34 - WlCopyHandler is not compatible on non-Linux systems and on Linux w/o Wayland
13:00:44 - DEBUG   - normcap.clipboard.handlers.xclip:38 - XclipCopyHandler is compatible
13:00:44 - DEBUG   - normcap.clipboard.main:84 - Compatible clipboard handlers: ['qt', 'xclip']
13:00:44 - DEBUG   - normcap.clipboard.handlers.qtclipboard:38 - QtCopyHandler requires no dependencies
13:00:44 - DEBUG   - normcap.clipboard.handlers.xclip:46 - XclipCopyHandler dependencies are installed (/tmp/.mount_NormCaZT6zGv/usr/bin/xclip)
13:00:44 - DEBUG   - normcap.clipboard.main:89 - Available clipboard handlers: ['qt', 'xclip']
13:00:44 - DEBUG   - normcap.clipboard.handlers.qtclipboard:34 - QtCopyHandler is compatible
13:00:44 - DEBUG   - normcap.clipboard.main:56 - Text copied to clipboard using 'qt.' handler
13:00:44 - DEBUG   - normcap.gui.notification:132 - Send notification via notify-send
13:00:49 - INFO    - normcap.gui.tray:610 - Exit normcap
13:00:49 - DEBUG   - normcap.gui.tray:611 - Debug images saved in /tmp/normcap

faveoled, Jan 18 '24 10:01
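
For anyone comparing their own debug log against the one above: the two relevant facts are the languages Tesseract reports as installed and the value passed as `lang=` in `TessArgs`. A minimal sketch for extracting both from a pasted log follows. The function names are hypothetical (they are not part of NormCap), and the parsing assumes the log looks like the excerpt in this issue.

```python
import re

# Two lines copied from the debug log in this issue.
SAMPLE_LOG = (
    "Tesseract command output: List of available languages in "
    '"/home/user/.config/normcap/tessdata/" (6): ¬ ara ¬ chi_sim ¬ deu ¬ eng ¬ rus ¬ spa ¬\n'
    "TessArgs(tessdata_path=PosixPath('/home/user/.config/normcap/tessdata'), "
    "lang='ara', oem=<OEM.DEFAULT: 3>, psm=<PSM.AUTO: 3>)\n"
)


def available_languages(log_text: str) -> list[str]:
    """Return the language codes listed after 'List of available languages ... (N):'."""
    match = re.search(r"List of available languages[^\n]*?\(\d+\):([^\n]*)", log_text)
    if match is None:
        return []
    # Language codes consist of letters, digits and underscores (e.g. chi_sim).
    return re.findall(r"[A-Za-z0-9_]+", match.group(1))


def selected_languages(log_text: str) -> set[str]:
    """Return every language code that appears in a lang='...' argument."""
    langs: set[str] = set()
    for value in re.findall(r"\blang='([^']+)'", log_text):
        # Tesseract accepts several languages joined by '+', e.g. 'eng+deu'.
        langs.update(value.split("+"))
    return langs


if __name__ == "__main__":
    print("available:", available_languages(SAMPLE_LOG))
    print("selected: ", sorted(selected_languages(SAMPLE_LOG)))
```

For the log in this issue, the sketch reports `ara` as the only selected language even though `eng` is installed, which matches the Arabic output.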

Fixed by removing .config/normcap. Note that I never edited the files there. I had used the app for some time at version 0.3.

faveoled, Jan 18 '24 10:01

Yeah, sorry about that. This was a known issue when updating from 0.3 to 0.4 (or above). :see-no-evil:

#372 explains the details.

If someone else reads this issue: Before deleting the config files, try toggling Arabic (ARA) on and off in the Settings menu, then toggle English (ENG) off and on. That should do the trick, too.

dynobo, Jan 19 '24 07:01
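
If toggling the languages does not help and the config directory has to go, moving it aside is a safer variant of deleting it. A minimal sketch, assuming the config lives in `~/.config/normcap` as shown in the debug log above. The function name and the `.bak-<timestamp>` suffix are hypothetical and not part of NormCap.

```python
from datetime import datetime
from pathlib import Path
from typing import Optional


def move_config_aside(config_dir: Path) -> Optional[Path]:
    """Rename config_dir to a timestamped sibling and return the new path.

    Returns None if config_dir does not exist, so there is nothing to move.
    """
    if not config_dir.is_dir():
        return None
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    # Hypothetical naming scheme: normcap -> normcap.bak-20240118-100138
    backup = config_dir.with_name(f"{config_dir.name}.bak-{stamp}")
    config_dir.rename(backup)
    return backup


if __name__ == "__main__":
    # Path taken from 'config_directory' in the debug log of this issue.
    result = move_config_aside(Path.home() / ".config" / "normcap")
    print("Moved to:", result if result else "nothing to move")
```

According to the log, `tessdata_path` points inside the same directory, so the downloaded language files move together with the settings. Run this only while NormCap is closed.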