preserve_interword_spaces in tesseract
Hi Team, I am currently using pyocr with tesseract 3.05.01, and I get the tesseract tool through pyocr.get_available_tools(). Is there any way to set preserve_interword_spaces for tesseract through pyocr?
Assuming you're using the Tesseract executable wrapper (pyocr.tesseract) and not the library binding (pyocr.libtesseract), then yes, you can: make your own builder.
See DigitBuilder and the other builders for reference.
My suggestion: inherit from TextBuilder and, in your constructor, just after calling the TextBuilder constructor, set self.tesseract_flags and self.tesseract_configs as you need.
Then just pass your new builder to pyocr.tesseract.image_to_string() (aka pyocr.get_available_tools()[0].image_to_string(), when tesseract is the first tool in the list), and you should get the expected result.