tesstrain

number of MAX_ITERATIONS

Open whisere opened this issue 2 years ago • 3 comments

Is this still the case: https://groups.google.com/g/tesseract-ocr/c/AnMYS98VwiE/m/1PN3mF6PAgAJ Does MAX_ITERATIONS depend on the number of lstmf files? If I have 1 million pairs of images and ground-truth text for training from scratch and I want to cover all of them, should I set MAX_ITERATIONS to 1 million? Thanks.

whisere avatar Jul 14 '22 02:07 whisere

Typically you would set MAX_ITERATIONS to a multiple of the number of lines used for training.

stweil avatar Jul 20 '22 20:07 stweil

Thanks! So is the multiple the number of epochs, i.e. max_iterations = epochs * total number of text lines? Are there any suggestions for a good multiple (number of epochs) when training from scratch without overtraining? Thank you!
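The relation asked about here can be sketched in a few lines of Python. This is only an illustration, assuming that one training iteration consumes one text line, as the reply above suggests. The function name `max_iterations` and the epoch count of 10 are hypothetical and are not tesstrain settings or recommendations.

```python
def max_iterations(num_training_lines: int, epochs: int) -> int:
    """Return an iteration budget that visits every training line `epochs` times.

    Assumes one iteration processes exactly one line (an assumption taken
    from the discussion, not checked against the lstmtraining source).
    """
    if num_training_lines <= 0 or epochs <= 0:
        raise ValueError("both arguments must be positive")
    return num_training_lines * epochs


# Example with the 1 million line pairs mentioned in the question.
# The epoch count of 10 is an arbitrary illustration.
print(max_iterations(1_000_000, 10))
```

With a single epoch the result equals the number of lines, which matches the idea of setting MAX_ITERATIONS to 1 million to see each of 1 million lines once.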

whisere avatar Jul 21 '22 00:07 whisere

If the TARGET_ERROR_RATE can't be reached after training for a long time, is it right to kill the training process and then run the following?

lstmtraining \
  --stop_training \
  --continue_from data/eeboecco/checkpoints/eeboecco_checkpoint \
  --traineddata data/eeboecco/eeboecco.traineddata \
  --model_output data/eeboecco.traineddata &

whisere avatar Jul 27 '22 00:07 whisere

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale[bot] avatar Nov 02 '22 01:11 stale[bot]