Uwe Hartwig

Results 61 comments of Uwe Hartwig

Hello, are there still any plans to integrate some kind of tool into tesstrain? I was facing similar requirements for generating training data in a Windows environment, which ended up...

@wrznr I must confess: there are some caveats. It adds another dependency, python-opencv. Pillow kept complaining about images >80 MB. Further, on Windows 10, one additionally needs to install C++14.0-Buildtools,...

Encountered a similar issue on "born digitals" that include rather few images and a text layer with many uncopyable control characters. Only the texts need to be redone. With `--force-ocr` file size increases...

For the Arabic text that is included as a text resource (288652) and that is causing trouble with bidi, please see the original image (binarized) ![288652](https://user-images.githubusercontent.com/4450366/99662030-02aeb680-2a65-11eb-8f0b-a31bf156c130.jpg)

@Shreeshrii Thanks for pointing to PAGE files that lack `Word` elements altogether, since that was the cause of the missing results in the provided Devanagari sample. I tried to...

@Shreeshrii Thanks for pointing towards ALTO V4. I had missed this before, since we're using the latest official stable release, tesseract 4.1, which doesn't create this kind of ALTO data. I've...

@Shreeshrii Please note, test images are created on the fly with a library that, out of the box, can render only a very small subset of UTF-8 characters, I guess only ASCII,...

@Shreeshrii Regarding the latest version: currently, there's only a pre-beta version (0.0.1) annotated in the `setup.py`. Usually this would be the place to follow versioning. I do not know how to utilize...

@Shreeshrii You're welcome! ... Sorry for the confusion regarding RTL ... finally, it turned out that the `-r` flag aims at something different from _real_ RTL, which can be handled...