tesseract
[Feature Request]: add an option to text2image that randomly mixes different font styles in the same line
--underline_start_prob Fraction of words to underline (value in [0,1]) (type:double default:0)
--underline_continuation_prob Fraction of words to underline (value in [0,1]) (type:double default:0)
The above options for text2image create underlined words in training images.
https://github.com/tesseract-ocr/tesseract/blob/a80a8f17bb32be8bdd5124057219620b711491a7/src/training/stringrenderer.cpp#L234
I suggest extending this so that text2image can also render some words in bold and italic in the images it creates with Pango.
http://hackage.haskell.org/package/pango-0.13.5.0/docs/Graphics-Rendering-Pango-Markup.html
C/C++ reference: https://developer.gnome.org/pango/stable/PangoMarkupFormat.html
Thanks @amitdo, for the C/C++ reference. I am hoping that adding this would lead to improved recognition.
It probably does not matter for English or other languages using Latin script, because there are a large number of bold and italic font variants that can be used directly for training.
For other languages, this could be implemented EITHER like underline, as a fraction of the words in the training text, OR like 'exposure' (values -3 -2 -1 0 1 2 3), with a 'styles' value of Regular, Bold or Italic applied to the whole text.
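The first option (a per-word fraction, like underline) could be sketched as follows. This is only an illustration of generating Pango markup for one line of text, not code from text2image: the function name `mix_styles` and the parameters `bold_prob` and `italic_prob` are hypothetical, and how text2image would consume the markup is not shown. It relies only on the `<b>` and `<i>` convenience tags described in the Pango markup reference linked above.

```python
import random
from html import escape


def mix_styles(line, bold_prob=0.1, italic_prob=0.1, rng=None):
    """Return `line` as Pango markup with some words randomly bold or italic.

    `bold_prob` and `italic_prob` are hypothetical names (not text2image
    options); each is the chance in [0, 1] that a given word gets that style.
    """
    rng = rng or random.Random()
    out = []
    for word in line.split(" "):
        # Pango markup is XML-like, so &, < and > must be escaped.
        text = escape(word, quote=False)
        if word and rng.random() < bold_prob:
            text = f"<b>{text}</b>"
        if word and rng.random() < italic_prob:
            text = f"<i>{text}</i>"
        out.append(text)
    return " ".join(out)


if __name__ == "__main__":
    # Seeded generator so the same training line can be reproduced.
    print(mix_styles("a line of training text", 0.3, 0.3, random.Random(0)))
```

Passing a seeded `random.Random` keeps the generated markup reproducible, which helps when the rendered image and its ground-truth text have to be regenerated together.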
@zdenop Please label as Feature Request
Any updates?
I'm afraid that there are no updates, but contributions which enhance the existing code and add new useful features are welcome.
I found a solution.
To fine-tune the model on font styles, generate training data with the font in its various styles. You can do this with tesstrain.sh, using the following parameters:
--fonts_dir # path to the font folder
--fontlist # specify the font name and styles here
I tried this with the Cairo font:
--fonts_dir font_folder # set the TTF file inside font_folder
--fontlist 'Cairo ExtraLight' 'Cairo Light' 'Cairo Regular' 'Cairo Medium' 'Cairo SemiBold' 'Cairo Bold' 'Cairo ExtraBold' 'Cairo Black'
After that, you can fine-tune the model on the generated dataset, and it will give you good results for this font in its different styles.
I hope this approach helps.
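As a rough sketch, the argument list for the Cairo example above could be assembled like this. The helper `tesstrain_args` is hypothetical; only `--fonts_dir` and `--fontlist` come from the comment above, and any other options the run needs would have to be supplied through `extra_args`. The script is not invoked here; the command is only printed.

```python
import shlex

# Style names taken from the Cairo example above.
CAIRO_STYLES = [
    "ExtraLight", "Light", "Regular", "Medium",
    "SemiBold", "Bold", "ExtraBold", "Black",
]


def tesstrain_args(fonts_dir, font_names, extra_args=()):
    """Build a tesstrain.sh argument list (hypothetical helper).

    Each font name is passed as its own argument after --fontlist,
    matching the quoted names in the example above.
    """
    return ["tesstrain.sh", "--fonts_dir", fonts_dir,
            "--fontlist", *font_names, *extra_args]


cmd = tesstrain_args("font_folder", [f"Cairo {s}" for s in CAIRO_STYLES])

if __name__ == "__main__":
    # shlex.join quotes names containing spaces for pasting into a shell.
    print(shlex.join(cmd))
```

Note that this renders each training line in a single style, so it covers per-font style variation rather than mixing styles within one line.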
The original request is about mixing different font styles in the same line.