Edouard Belval
Okay so quickly:
- Add multiprocessing
- Properly time the different parts of the script, as not everything is the call to TRDG. You can run `py-spy` to see function...
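As a rough sketch of both suggestions, the snippet below times each stage of a script separately and spreads the generation work over several processes. `render_batch` is a hypothetical stand-in for whatever per-batch work the script does (it is not a TRDG function), and `timed`, `chunked`, and `run` are illustrative names, so treat this as a starting point rather than the actual script.

```python
import time
from contextlib import contextmanager
from multiprocessing import Pool


@contextmanager
def timed(label, store):
    """Accumulate the wall-clock time of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        store[label] = store.get(label, 0.0) + (time.perf_counter() - start)


def render_batch(strings):
    """Hypothetical stand-in for the real per-batch work (e.g. calling the generator)."""
    return [s.upper() for s in strings]


def chunked(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def run(strings, workers=2, batch_size=2):
    """Process `strings` in parallel and return (results, per-stage timings)."""
    timings = {}
    with timed("prepare", timings):
        batches = chunked(strings, batch_size)
    with timed("generate", timings):
        # Each worker process handles whole batches to limit pickling overhead.
        with Pool(processes=workers) as pool:
            results = pool.map(render_batch, batches)
    with timed("collect", timings):
        flat = [item for batch in results for item in batch]
    return flat, timings
```

Call `run(...)` from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely, then print the returned timings to see which stage dominates before tuning anything.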
Your initial comment was removed or edited away. But if you really generate images with 40 words/image, a speed of 11-13 imgs/sec is not that bad. I can try to...
I see. I never benchmarked each option, so maybe try removing one of these lines and measure the impact on processing time:
```
distorsion_type=np.random.choice(distorsion_type),
skewing_angle=np.random.choice(skewing_angle),
blur=np.random.choice(blur),
```
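One way to run that comparison systematically is sketched below: time the full configuration once, then once more with each option removed. `dummy_factory` is a hypothetical stand-in that only simulates work, and `time_generation` and `ablation` are illustrative helpers; the assumption is that the real generator can be called with the same keyword arguments and iterated in the same way.

```python
import time
from itertools import islice


def time_generation(factory, options, count):
    """Return the seconds needed to pull `count` items from factory(**options)."""
    start = time.perf_counter()
    for _ in islice(factory(**options), count):
        pass
    return time.perf_counter() - start


def ablation(factory, options, keys, count=100):
    """Time the full configuration, then again with each key in `keys` removed."""
    results = {"all options": time_generation(factory, options, count)}
    for key in keys:
        reduced = {k: v for k, v in options.items() if k != key}
        results[f"without {key}"] = time_generation(factory, reduced, count)
    return results


def dummy_factory(**options):
    """Hypothetical stand-in for an image generator; yields forever."""
    while True:
        # Simulated cost that grows with the number of enabled options.
        yield sum(i * i for i in range(1000 * (1 + len(options))))


if __name__ == "__main__":
    options = {"distorsion_type": 1, "skewing_angle": 5, "blur": 2}
    for label, seconds in ablation(dummy_factory, options, list(options)).items():
        print(f"{label}: {seconds:.4f}s")
```

Running each configuration several times and comparing the medians gives more reliable numbers than a single pass, since short runs are noisy.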
To be clear, you have multiple custom fonts that you want to use when generating?
That's the part I didn't understand from your initial issue. I can confirm that there is no support for that right now, but it's a feature that I wanted to...
Just put a single font path in the `fonts` list. For example, with `~/my_font.ttf` you would do:
```py
from trdg.generators import GeneratorFromStrings

generator = GeneratorFromStrings(
    ["Test1", "Test2"],
    fonts=["~/my_font.ttf"],  # a list containing only your custom font
    background_type=3,
)
```
Off the top of my head, binarizing usually produces substandard OCR because it implies thresholding the image to decide whether each pixel is black OR white. This causes the font...
This particular problem arises when Tesseract was not compiled with the standard library. It was added to the default makefile 2 weeks ago, so you might want to rebuild tesseract...
tesseract-ocr/tesseract@b0ead95d64a3667339775b2f99ac37e97e90c2a0 should be good, but did you `sudo make uninstall` before reinstalling Tesseract 4?
@zqy2084 If you still experience an issue with that revision, please post the error; it'll be easier to debug.