papermerge

Additional languages (rus, ukr) cannot be selected in the OCR language list

Open deimjons opened this issue 1 year ago • 1 comments

Description

Hello. I am using a custom Docker image with the Russian and Ukrainian language packages for Tesseract installed, following the instructions at https://docs.papermerge.io/3.0/setup/add-ocr-langs/

Dockerfile:

FROM papermerge/papermerge:3.1

# add Ukrainian and Russian OCR languages
RUN apt install tesseract-ocr-rus tesseract-ocr-ukr

Info:

  • Papermerge Version: 3.1

The added languages are present in the container:
# docker exec -it papermerge bash
# tesseract --list-langs
List of available languages in "/usr/share/tesseract-ocr/5/tessdata/" (11):
deu
eng
fra
ita
nld
osd
por
ron
rus
spa
ukr

I can see these languages in the OCR language list, but I cannot select them and run OCR.

Screenshot 2024-04-19 at 18 45 39

deimjons, Apr 20 '24 01:04

Yes, because the "rus" and "ukr" language codes are missing in the following places:

  • https://github.com/papermerge/papermerge-core/blob/master/ui/src/types.ts#L269
  • https://github.com/papermerge/papermerge-core/blob/master/ui/src/cconstants.ts#L10

I would gladly accept your pull request.
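
For anyone picking this up, a pull request would presumably add the two codes to both the type and the constants. The sketch below is only an illustration of that kind of change: the names `OcrCode`, `OCR_LANGS`, and `isOcrCode` are hypothetical, the list of existing codes is taken from the `tesseract --list-langs` output above rather than from the repository, and the real definitions in `types.ts` and `cconstants.ts` may be shaped differently.

```typescript
// Hypothetical sketch, not the actual papermerge-core source.
// Names and the set of pre-existing codes are assumptions.

// Union of OCR language codes the UI accepts (assumed name: OcrCode).
type OcrCode =
  | "deu"
  | "eng"
  | "fra"
  | "ita"
  | "nld"
  | "por"
  | "ron"
  | "spa"
  | "rus" // newly added
  | "ukr" // newly added

// Code-to-label map used to render the language list
// (assumed name: OCR_LANGS). Typing it as Record<OcrCode, string>
// makes the compiler reject a code that is in the union but
// missing from the map, so the two places cannot drift apart.
const OCR_LANGS: Record<OcrCode, string> = {
  deu: "German",
  eng: "English",
  fra: "French",
  ita: "Italian",
  nld: "Dutch",
  por: "Portuguese",
  ron: "Romanian",
  spa: "Spanish",
  rus: "Russian", // newly added
  ukr: "Ukrainian", // newly added
}

// Type guard: true only for codes the UI knows about.
// hasOwnProperty avoids false positives for inherited keys
// such as "toString".
function isOcrCode(code: string): code is OcrCode {
  return Object.prototype.hasOwnProperty.call(OCR_LANGS, code)
}
```

With a guard like this, `isOcrCode("rus")` and `isOcrCode("ukr")` return true after the change, while an installed but unlisted Tesseract entry such as `osd` is still rejected.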

ciur, Apr 21 '24 05:04