tesseract [Feature Request]: add an option to text2image that randomly mixes different font styles in the same line

Open Shreeshrii opened this issue 6 years ago • 7 comments

--underline_start_prob  Fraction of words to underline (value in [0,1])  (type:double default:0)
--underline_continuation_prob  Fraction of words to underline (value in [0,1])  (type:double default:0)

The above options for text2image create underlined words in training images.

https://github.com/tesseract-ocr/tesseract/blob/a80a8f17bb32be8bdd5124057219620b711491a7/src/training/stringrenderer.cpp#L234

I suggest extending this mechanism so that text2image can also render some words in bold and italics in the images it creates with Pango.

http://hackage.haskell.org/package/pango-0.13.5.0/docs/Graphics-Rendering-Pango-Markup.html
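
To illustrate what a per-word variant might look like, here is a minimal Python sketch that builds a Pango markup string with random runs of bold or italic words. It only loosely follows the start/continuation idea of the underline options. The function name `mix_styles_markup` and its parameters are hypothetical and not part of text2image; only the `<b>` and `<i>` tags come from the Pango markup format.

```python
import html
import random

STYLE_TAGS = ("b", "i")  # Pango markup convenience tags for bold and italic


def mix_styles_markup(line, start_prob, continuation_prob, rng=None):
    """Return Pango markup for `line` with random runs of bold/italic words.

    Hypothetical helper: start_prob is the chance that an unstyled word
    starts a styled run, continuation_prob the chance that a run extends
    to the next word. Runs of whitespace are collapsed to single spaces.
    """
    rng = rng or random.Random()
    out = []
    current = None  # tag of the styled run in progress, or None
    for word in line.split():
        if current is not None:
            # Inside a run: decide whether it continues onto this word.
            if rng.random() >= continuation_prob:
                current = None
        elif rng.random() < start_prob:
            # Outside a run: maybe start one, picking a style at random.
            current = rng.choice(STYLE_TAGS)
        text = html.escape(word, quote=False)  # escape &, <, > for markup
        out.append(f"<{current}>{text}</{current}>" if current else text)
    return " ".join(out)


if __name__ == "__main__":
    print(mix_styles_markup("the quick brown fox jumps", 0.3, 0.5))
```

A string like this could be passed to a markup-aware Pango call such as `pango_layout_set_markup` instead of plain text. How that would fit into StringRenderer is left open here.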

Jun 15 '18 20:06 Shreeshrii

C/C++ reference: https://developer.gnome.org/pango/stable/PangoMarkupFormat.html

Jun 15 '18 21:06 amitdo

Thanks @amitdo, for the C/C++ reference. I am hoping that adding this would lead to improved recognition.

It probably does not matter for English or other languages using Latin script because there are a large number of font variations for bold and italics, which can be used directly for training.

For other languages, the option could be implemented in either of two ways: like the underline options, where the style applies to a fraction of the words in the training text, or like the 'exposure' values (-3 -2 -1 0 1 2 3), where each of the 'styles' Regular, Bold, and Italic is applied to the whole text.
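
As a rough illustration of the second alternative, the Python sketch below produces one whole-text variant per style, similar to how one image set is produced per exposure value. The names `STYLE_TO_TAG` and `whole_text_variants` are hypothetical, and the mapping from style names to Pango markup tags is an assumption made for this example.

```python
import html

# Hypothetical mapping from style name to Pango markup tag (None = no tag).
STYLE_TO_TAG = {"Regular": None, "Bold": "b", "Italic": "i"}


def whole_text_variants(text, styles=("Regular", "Bold", "Italic")):
    """Yield (style_name, markup) pairs, one whole-text variant per style."""
    escaped = html.escape(text, quote=False)  # escape &, <, > for markup
    for name in styles:
        tag = STYLE_TO_TAG[name]
        markup = f"<{tag}>{escaped}</{tag}>" if tag else escaped
        yield name, markup


if __name__ == "__main__":
    for name, markup in whole_text_variants("sample training line"):
        print(name, markup)
```

Compared with the per-word fraction, this variant never mixes styles within a line, so on its own it would not cover the case in the issue title.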

Jun 16 '18 08:06 Shreeshrii

@zdenop Please label as Feature Request.

Sep 27 '18 12:09 Shreeshrii

Any updates?

Jan 23 '24 11:01 AhmadHakami

I'm afraid that there are no updates, but contributions which enhance the existing code and add new useful features are welcome.

Jan 23 '24 11:01 stweil

I found a solution.

When you need to fine-tune the model on different styles, you have to generate training data with a font in its various styles. You can do this with tesstrain.sh, focusing on the following parameters:

--fonts_dir   # path to the font folder
--fontlist    # specify the font name and styles here

I tried this with the Cairo font:

--fonts_dir font_folder # put the TTF files inside font_folder
--fontlist 'Cairo ExtraLight' 'Cairo Light' 'Cairo Regular' 'Cairo Medium' 'Cairo SemiBold' 'Cairo Bold' 'Cairo ExtraBold' 'Cairo Black'

After that, you can fine-tune the model on the generated dataset, and it should give good results for this font in its different styles.

I hope this approach helps.
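
For anyone scripting this, here is a small Python sketch that assembles just the two arguments shown above, with each font name passed as its own argument. The helper `tesstrain_font_args` is hypothetical. The other arguments tesstrain.sh needs are left out, and the style names are simply the ones from my Cairo example.

```python
import shlex


def tesstrain_font_args(fonts_dir, font_names):
    """Build the --fonts_dir / --fontlist arguments (hypothetical helper)."""
    # Each font name is a separate argument after --fontlist.
    return ["--fonts_dir", fonts_dir, "--fontlist", *font_names]


# Style names taken from the Cairo example above.
cairo_styles = ["ExtraLight", "Light", "Regular", "Medium",
                "SemiBold", "Bold", "ExtraBold", "Black"]
args = tesstrain_font_args("font_folder", [f"Cairo {s}" for s in cairo_styles])

if __name__ == "__main__":
    # shlex.join quotes names containing spaces for a POSIX shell.
    print(shlex.join(args))
```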

Jan 24 '24 06:01 AhmadHakami

The original request is about mixing different font styles in the same line.

Jan 24 '24 07:01 amitdo
