naourass issues

Results: 7 issues by naourass

Keeping annotations order

### My actions before raising this issue - [x] Read/searched [the docs](https://github.com/opencv/cvat/tree/master#documentation) - [x] Searched [past issues](/issues) ### Expected Behaviour I expect the order of the annotations objects to stay...

enhancement

Latin words inside Arabic text issue

### Environment * **Tesseract Version**: tesseract 4.1.1 leptonica-1.79.0 * **Commit Number**: installed through `apt install tesseract-ocr` * **Platform**: Linux DESKTOP-xxxxxxx 5.10.102.1-microsoft-standard-WSL2 (Ubuntu 20.04) ### Current Behavior: Tesseract fails to...

multilingual ocr

Legacy ara language not working with recent versions of tesseract

### Environment * **Tesseract Version**: 5.x, 4.1.x, 4.0.x * **Platform**: Linux DESKTOP-**** 5.10.102.1-microsoft-standard-WSL2 x86_64 GNU/Linux (Ubuntu 20.04) ### Current Behavior: While other legacy languages are working fine with recent versions...

bug

Simple image one line text not recognized for some mysterious reason

### Current Behavior Among many similar images (same dimension/layout/content) that have been ocr'd correctly, this one returns an empty string: ![Input Image](https://github.com/tesseract-ocr/tesseract/assets/4897498/4a8d6df7-d2cf-4c51-8d19-275d69030e35) I tried with `ara` and `Arabic`, both fast...

Arabic ligatures order issue when extracting text from PDF

When extracting Arabic text, the words are returned in backward order, which is normal behavior for RTL languages, and you need to use the bidi algorithm to be able to...

Fixing Text Extraction Order For Arabic+Digits+Punctuation

## Explanation When you have Arabic text mixed with digits, the text extraction order is messed up. Below is an example. 1. Reading from right to left, here's the ground...

workflow-text-extraction

is-feature

Link to PyPi

If you find the library through GitHub, it's not clear that there's an installable pip package for the project. A link to the PyPI package would be helpful.