Tesseract-OCR-iOS Using traineddata from tesseract-ocr

Hello,

I would like to know how to use the traineddata available from tesseract-ocr without triggering this assertion failure:

actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53

Much appreciated.

Dec 01 '16 08:12 takzee

+1

Dec 02 '16 15:12 mgerdt

+1

Dec 06 '16 06:12 MaxTalic

I was able to resolve it by using a different version of the traineddata file that I borrowed from a Tesseract tutorial posted elsewhere. I found that the installation instructions for the Tesseract iOS repo work perfectly, but the current version of the traineddata does not work with 4.0.0. Details here

Update: Upon further digging, I discovered that this version of tessdata has the same eng.traineddata file I used to get my project working.

Dec 15 '16 15:12 AdrianBinDC

+1

Dec 19 '16 18:12 mirko-fairr

+1

Dec 20 '16 12:12 flyweights

+1

Dec 22 '16 12:12 neoneye

Apparently this is a very common issue: I keep getting upvotes for it, but nobody actually knows how to solve it.

Dec 23 '16 06:12 takzee

@takzee I was able to solve it using the link in my post above.

Dec 23 '16 17:12 AdrianBinDC

@AdrianBinDC Yes, that is a workaround. I think this issue should still be looked into, since the workaround only applies to English and to whichever other languages happen to have compatible trained files available.

Dec 27 '16 02:12 takzee

I believe this issue could be solved by upgrading the Tesseract version to 3.04 so that it is in sync with the training data here: https://github.com/tesseract-ocr/langdata.

There is at least one fork where this is done, e.g. https://github.com/exherb/Tesseract-OCR-iOS
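
A minimal sketch for checking a traineddata file before handing it to the library, so the mismatch shows up without a crash. It assumes the file begins with a 32-bit entry count, which is what the asserted variable name suggests but is not confirmed in this thread. The function names and the `compiledLimit` parameter are hypothetical; the real limit is the TESSDATA_NUM_ENTRIES value in the Tesseract source your build links against.

```swift
import Foundation

/// Reads the leading 32-bit entry count of a traineddata file.
/// Assumption: the file starts with a 32-bit integer holding the number of
/// entries. If the little-endian reading is implausibly large, the
/// byte-swapped value is tried instead.
func traineddataEntryCount(in data: Data, plausibleMax: UInt32 = 64) -> UInt32? {
    guard data.count >= 4 else { return nil }
    let bytes = [UInt8](data.prefix(4))
    // Assemble the four bytes as a little-endian value first.
    let little = UInt32(bytes[0])
        | (UInt32(bytes[1]) << 8)
        | (UInt32(bytes[2]) << 16)
        | (UInt32(bytes[3]) << 24)
    if little <= plausibleMax { return little }
    let big = little.byteSwapped
    return big <= plausibleMax ? big : nil
}

/// Returns true when the file's entry count fits the limit of the linked library.
/// `compiledLimit` is a hypothetical parameter supplied by the caller.
func isLikelyCompatible(_ data: Data, compiledLimit: UInt32) -> Bool {
    guard let count = traineddataEntryCount(in: data) else { return false }
    return count <= compiledLimit
}
```

In an app, the data would come from something like `try Data(contentsOf: fileURL)` for the eng.traineddata file in question. A `false` result only suggests that the file is newer than the library; it does not prove the file will otherwise load.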

Jan 05 '17 10:01 FWJonathan

https://github.com/exherb/Tesseract-OCR-iOS This fork works for Chinese, thank you. @FWJonathan

Jan 06 '17 02:01 flyweights

@AdrianBinDC Thanks for saving my time!! Your solution really works for me. I installed it using pod 'TesseractOCRiOS', '4.0.0' and it just crashed.

Jan 11 '17 02:01 CaliosD

To resolve this issue, use an older version of the training data from: https://github.com/tesseract-ocr/tessdata/tree/3.04.00. It worked for me.
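
After swapping in older files, it can help to confirm that every language you request actually has a file in place. A minimal sketch using only Foundation; the function name is hypothetical, and the assumption that the library looks in a folder named "tessdata" inside the app bundle should be checked against the setup instructions you followed.

```swift
import Foundation

/// Returns the language codes whose "<code>.traineddata" file is missing
/// from the given directory.
func missingLanguages(_ languages: [String], in directory: URL) -> [String] {
    languages.filter { code in
        let file = directory.appendingPathComponent("\(code).traineddata")
        return !FileManager.default.fileExists(atPath: file.path)
    }
}
```

Possible usage, under the same assumption: `missingLanguages(["eng", "chi_sim"], in: Bundle.main.bundleURL.appendingPathComponent("tessdata"))`. An empty result only means the files exist, not that their version matches the library.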

Jan 12 '17 17:01 bibhas2

A friend checked the Android version: using the newest tessdata files there gives better OCR results for Chinese. Does anyone know how to update the Tesseract library?

Jan 16 '17 03:01 freedylam

How can I fix this? Please help me.

actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file tessdatamanager.cpp, line 53

Jan 19 '17 10:01 monxarat

Switching back to Tessdata 3.0.4 allows the program to compile but the results are horrendous. I supplied a very simple image with English words and the program failed to recognize it coherently. I wonder if the 4.0.0 version would be better. However, I'm still experiencing that error as of the latest master.

Mar 07 '17 05:03 hudaniel

@computerion Thank you very much.

Mar 10 '17 06:03 monxarat

@AdrianBinDC Thank you~~~

Mar 16 '17 07:03 SuperZico

[email protected] is working on Android, and I found that the accuracy of [email protected] is better than this version of tessdata.

Apr 12 '17 06:04 mdsb100

@gali8 Get some help, please.

May 15 '17 01:05 mdsb100

Hello. I have a problem with the Japanese language. I hope to get help! Thank you so much.

Jul 31 '17 08:07 hungnmai

@AdrianBinDC thanks for helping. It works with 4.0 version on iOS

Aug 09 '17 20:08 brkyvrkn

The previous version of the data won't crash, but it can't recognize anything.

My code:

    let tesseract: G8Tesseract = G8Tesseract(language: "eng")
    tesseract.delegate = self
    // Restrict recognition to digits only.
    tesseract.charWhitelist = "0123456789"
    tesseract.image = UIImage(named: "sample.jpg")
    tesseract.recognize()

    // Log whatever text was recognized.
    NSLog("%@", tesseract.recognizedText)

The image:

sample

The result: empty!

Aug 14 '17 17:08 zhouhao27

+1, I have the same problem with the Thai language. @gali8, any idea how to resolve it?

Sep 14 '17 09:09 ckgal

Hello, I am using the tessdata from this repo https://github.com/tesseract-ocr/tessdata/tree/bf82613055ebc6e63d9e3b438a5c234bfd638c93

But they won't work with pod 'TesseractOCRiOS', '4.0.0'

My goal is to use this project, https://github.com/vinhvu200/BillSplit, with traineddata for other languages, but I don't know how to find out which tessdata version I have to use. I would try them all, but I have already cloned several versions (2 GB!) and my internet connection is not that fast.
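
To avoid cloning whole repositories, a single language file can be fetched for one tag or commit at a time. A minimal sketch, assuming GitHub serves individual files under `/<owner>/<repo>/raw/<ref>/<path>`; the function name is hypothetical, and whether a given ref contains a file for your language is not checked here.

```swift
import Foundation

/// Builds a download URL for one traineddata file at a given tag or commit
/// of the tesseract-ocr/tessdata repository.
/// Assumption: the "/raw/<ref>/<file>" path layout is available on github.com.
func traineddataURL(language: String, ref: String) -> URL? {
    var components = URLComponents()
    components.scheme = "https"
    components.host = "github.com"
    components.path = "/tesseract-ocr/tessdata/raw/\(ref)/\(language).traineddata"
    return components.url
}
```

For example, `traineddataURL(language: "eng", ref: "3.04.00")` points at the English file for the 3.04.00 tag mentioned earlier in this thread; a commit hash can be passed as `ref` in the same way.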

Aug 01 '18 08:08 ghost

Hello, https://github.com/gali8/Tesseract-OCR-iOS/issues/299

    int returnCode = self.tesseract->Init(self.absoluteDataPath.fileSystemRepresentation, self.language.UTF8String, (tesseract::OcrEngineMode)self.engineMode, (char **)configs, count, &tessKeys, &tessValues, false);

This is my console print message:

actual_tessdata_num_entries_ <= TESSDATA_NUM_ENTRIES:Error:Assert failed:in file ../../ccutil/tessdatamanager.cpp, line 53

I hope to get your help.

Jan 19 '19 02:01 KGDeveloper

> I believe this issue could be solved by upgrading the Tesseract version to 3.04 so that it is sync with the training data here: https://github.com/tesseract-ocr/langdata.
>
> There is at least one fork were this is done, e.g. https://github.com/exherb/Tesseract-OCR-iOS

> As friend checked, for Android version, their using newest tessdata file have better OCR result with Chinese..Is anyone know how to update the Tesseract library?

Hello, have you solved this problem? I have a similar one: Android and Windows work fine, but iOS crashes. I checked the OCR version used on Android, which is 3.0.5. I plan to recompile the submodule dependency at a newer library version to solve this, but more problems appeared after the change. Has anyone managed to upgrade successfully?

Mar 18 '20 03:03 zhuozhuo

I found that this solves the problem: https://github.com/chaoskyme/Tesseract-OCR-iOS

Mar 25 '20 01:03 zhuozhuo

Thanks @AdrianBinDC, your suggested traineddata files are compatible with both Android and iOS. I do have a question. From what I understand, the traineddata files from the normal_tessdata directory are compatible with Android and iOS, but the traineddata files from the tessdata_best and tessdata_fast directories are not, and give the error TessBaseAPIInit3(tessHandle,dataPath,lang) != 0.

I need to perform some additional training on the eng.traineddata file, for which I must use the traineddata file from the tessdata_best directory. But files from this directory are not compatible with Android and iOS.

Is there any way to make the files from the tessdata_best directory run on Android? Why are files from "tessdata" compatible, but those from "tessdata_best" not?

[I am using Tesseract version 4.1]

Thanks...

Mar 27 '20 11:03 Kunal-git
