tesstrain icon indicating copy to clipboard operation
tesstrain copied to clipboard

Training with python: run training step ?

Open forzagreen opened this issue 9 months ago • 1 comments

As mentioned by @stefan6419846 in https://github.com/madmaze/pytesseract/issues/508 , there is a python wrapper for training in tesstrain/src/ , which unfortunately is not documented in tesseract, tessdoc and tesstrain repositories.

From my understanding: (please correct me if I'm wrong)

  1. It only generates lstmf files, and does not perform any training. In the steps mentioned in Overview of Training Process, it stops at step 5. Steps 6 and 7 must be done separately. Is that correct ?

  2. How to perform steps 6 and 7 ? with Makefile commands ? if you give me some inputs, I can help adding these steps to the python script.

  3. The python script takes a TEXTFILE and generates (for each font) box/tif/lstmf files for the hole text, not line by line. So, in order to generate line by line, we must run the script for each one-line file ?

Thanks in advance !

Cc: @stefan6419846

forzagreen avatar Sep 18 '23 07:09 forzagreen