papermerge
Additional languages (rus, ukr) cannot be selected in the OCR language list
Description
Hello. I am using a custom Docker image with the Russian and Ukrainian language packages for Tesseract installed, following the instructions at https://docs.papermerge.io/3.0/setup/add-ocr-langs/
Dockerfile:
FROM papermerge/papermerge:3.1
# add Ukrainian and Russian OCR languages
RUN apt install tesseract-ocr-rus tesseract-ocr-ukr
Info:
- Papermerge version: 3.1
The added languages are present in the container:
# docker exec -it papermerge bash
# tesseract --list-langs
List of available languages in "/usr/share/tesseract-ocr/5/tessdata/" (11):
deu
eng
fra
ita
nld
osd
por
ron
rus
spa
ukr
I see these languages in the OCR language list, but I cannot select them and run OCR.
Yes, because the "rus" and "ukr" language codes are missing in the following places:
- https://github.com/papermerge/papermerge-core/blob/master/ui/src/types.ts#L269
- https://github.com/papermerge/papermerge-core/blob/master/ui/src/cconstants.ts#L10
I would gladly accept your pull request.
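For anyone picking this up, here is a rough sketch of what such a change might look like. It is not copied from the repository: the names `OcrLangCode`, `OCR_LANG_LABELS` and `isSupportedOcrLang` are hypothetical, and the real definitions in the two files linked above may be shaped differently. The idea is only that "rus" and "ukr" need to appear both in the type of accepted codes and in the constant that lists them.

```typescript
// Hypothetical sketch. The real type and constant live in the two files
// linked above and may use different names and shapes.

// Union of OCR language codes the UI is willing to accept (assumed shape).
// "osd" is left out on purpose: it is Tesseract's orientation and script
// detection data, not a language.
type OcrLangCode =
  | "deu"
  | "eng"
  | "fra"
  | "ita"
  | "nld"
  | "por"
  | "ron"
  | "spa"
  | "rus" // newly added
  | "ukr" // newly added

// Display labels keyed by language code (assumed shape). Because the record
// is keyed by OcrLangCode, the compiler reports an error if a code is added
// to the union but forgotten here.
const OCR_LANG_LABELS: Record<OcrLangCode, string> = {
  deu: "Deutsch",
  eng: "English",
  fra: "Français",
  ita: "Italiano",
  nld: "Nederlands",
  por: "Português",
  ron: "Română",
  spa: "Español",
  rus: "Русский",
  ukr: "Українська",
}

// Type guard: narrows an arbitrary string (for example a code reported by
// `tesseract --list-langs`) to a code the UI knows about.
function isSupportedOcrLang(code: string): code is OcrLangCode {
  return Object.prototype.hasOwnProperty.call(OCR_LANG_LABELS, code)
}
```

With a guard like this, a code that Tesseract reports but the UI does not list (which is the situation described in this issue) shows up as `false` instead of being displayed as an unselectable entry.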