
Warning: No shape table file present: shapetable

Open skyfire88 opened this issue 2 years ago • 1 comments

Environment:

  • Tesseract Version: Tesseract 5.0
  • Platform: Windows 10

the shell:

E:\Tess-OCR\N0>tesseract.exe num.font.exp0.tif num.font.exp0 batch.nochop makebox
Page 1
Page 2
Page 3

E:\Tess-OCR\N0>tesseract.exe num.font.exp0.tif num.font.exp0 -l eng --psm 7 nobatch box.train
Page 1
APPLY_BOXES:
Boxes read from boxfile: 6
Found 6 good blobs.
Generated training data for 1 words
Page 2
APPLY_BOXES:
Boxes read from boxfile: 5
Found 5 good blobs.
Generated training data for 1 words
Page 3
APPLY_BOXES:
Boxes read from boxfile: 4
Found 4 good blobs.
Generated training data for 1 words

E:\Tess-OCR\N0>unicharset_extractor num.font.exp0.box
Extracting unicharset from box file num.font.exp0.box
Wrote unicharset file unicharset

E:\Tess-OCR\N0>mftraining -F font_properties -U unicharset -O num.unicharset num.font.exp0.tr
Warning: No shape table file present: shapetable
Reading num.font.exp0.tr ...
Flat shape table summary: Number of shapes = 8 max unichars = 1 number with multiple unichars = 0

E:\Tess-OCR\N0>cntraining num.Roman.exp0.tr
Reading num.Roman.exp0.tr ...
TrainingPage:Error:Assert failed:in file ../../../src/training/cntraining.cpp, line 123

I am training on images that contain only numbers. I have tried many times, but it fails when I run mftraining... Please help me! Thanks!

skyfire88, Apr 02 '22 14:04

Do you have a clue what you are doing? Why are you trying to train the legacy engine (tesseract <=3.x) with tesseract 5? What are you expecting to get?

zdenop, Apr 03 '22 16:04

The training tools for the legacy OCR engine had been broken since version 5.0. They were fixed recently.

Try the latest code in the main branch of this repo, or wait for v5.3.0, which will be released soon.

amitdo, Dec 22 '22 07:12
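
Editor's sketch (not part of the thread): the last reply says the legacy training tools were fixed for v5.3.0, so it can help to confirm which version is installed before retrying mftraining and cntraining. The Python below is a minimal sketch under stated assumptions: that the first line of `tesseract --version` contains a version of the form X.Y.Z, and that 5.3.0 is the first fixed release, as the reply states. The function names are hypothetical. A build from the main branch may report a lower number even though it includes the fix, so treat a negative result as a hint only.

```python
import re
import subprocess

# Assumption taken from the thread: the fix ships with v5.3.0.
FIXED_IN = (5, 3, 0)


def parse_version(banner):
    """Return (major, minor, patch) from the first line of banner, or None."""
    lines = banner.splitlines()
    first_line = lines[0] if lines else ""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", first_line)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def has_fixed_training_tools(banner):
    """True if the reported version is at least FIXED_IN."""
    version = parse_version(banner)
    return version is not None and version >= FIXED_IN


def read_banner():
    """Run `tesseract --version`; return "" if the executable is not found."""
    try:
        result = subprocess.run(
            ["tesseract", "--version"], capture_output=True, text=True
        )
    except OSError:
        return ""
    # Some builds print the banner to stderr, so both streams are checked.
    return result.stdout or result.stderr


if __name__ == "__main__":
    banner = read_banner()
    if not banner:
        print("tesseract not found on PATH")
    elif has_fixed_training_tools(banner):
        print("version is 5.3.0 or later")
    else:
        print("version is older than 5.3.0 or could not be parsed")
```

Only the first line of the banner is parsed because later lines list library versions (for example leptonica), which would otherwise be mistaken for the Tesseract version.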