Tesseract 4.0

Open ekrivopaltsev opened this issue 8 years ago • 16 comments

Hi Folks,

Is a new version based on Tesseract 4.x on your roadmap? I saw preliminary results, and 4.x seems to have a lot of potential.

Thanks, --eugene

ekrivopaltsev avatar Feb 15 '17 22:02 ekrivopaltsev

Hi, I have forked this repo and updated it to the new version, 4.00.000alpha: https://github.com/chaoskyme/Tesseract-OCR-iOS.git
You can use it with CocoaPods: pod 'TesseractOCRiOS', :git => 'https://github.com/chaoskyme/Tesseract-OCR-iOS.git'

xwal avatar Feb 17 '17 01:02 xwal

Hi Alex,

Thank you for the update, it is great. Could you please let me know whether I can use traineddata prepared for Tesseract 3.x?

Thanks,

Eugene

ekrivopaltsev avatar Feb 17 '17 01:02 ekrivopaltsev

You can download it from https://github.com/tesseract-ocr/tessdata/tree/4d64457aacbc781a94d6aefc125765c3949c8827. The tessdata releases only cover 3.04.00 and 4.00, so for older data you can browse the commits from before the 3.04.00 release.
Alternatively, you can download from https://sourceforge.net/projects/tesseract-ocr-alt/files/?source=navbar, which has the Tesseract 3.02 traineddata.

xwal avatar Feb 17 '17 01:02 xwal

Hi Alex,

Thank you very much for the lead. My question was about tessdata compatibility, as I have my own set trained for special fonts.

I tried your branch today. First, it works :) Second, it runs about 30-40% slower than 3.4. Based on the general 4.x discussion I was afraid of a significant memory increase, but I found only a slight increase in the size of the tesseract library: 5.4 MB vs. 5 MB for 3.4. I did not test run-time memory differences. In terms of accuracy, 3.4 has so far given better results with my private tessdata.

I am very interested in your progress. Please keep me posted.

Regards, --eugene

ekrivopaltsev avatar Feb 17 '17 18:02 ekrivopaltsev

@chaoskyme Hey there, I am trying to update to the 4.0alpha based on this recommendation: https://github.com/tesseract-ocr/tesseract/issues/963#event-1104463348

I updated my Podfile to point to your fork, but I am getting exactly the same results as in the issue referenced above. Would you mind testing that file and seeing if you get the same odd result? I am using these tessdata files: https://github.com/tesseract-ocr/tessdata/releases/tag/3.04.00

Just the English language. Are these the right files? (Screenshot attached.)

davecoffin avatar May 31 '17 16:05 davecoffin

@chaoskyme Hi, thank you so much for updating it to version 4.0.0. Could you share how you compiled the source code? I followed the instructions, but they did not work for me. Thank you very much!

nilinyi avatar Aug 25 '17 19:08 nilinyi

@nilinyi Hi, there is a makefile in the TesseractOCR directory; you can run make to build the libs.

xwal avatar Aug 28 '17 02:08 xwal

@chaoskyme Thank you so much! However, I got the same problem as the one described in issue #327. By the way, I also updated libpng in the makefile to version 1.6.32.

nilinyi avatar Sep 12 '17 02:09 nilinyi

@chaoskyme

Thanks for your fork supporting v4.0 alpha. It supports the traditional traineddata (larger files) very well.

The problem is that it does not support the new LSTM traineddata (smaller files), for example fast or best.

It reports the following error message:

"Failed loading language 'eng' "Tesseract couldn't load any languages!""

Any clue how to solve this issue?

Many Thanks!

drwjf avatar Oct 03 '17 01:10 drwjf

@drwjf Instead of using G8Tesseract(language:"eng"), use G8Tesseract(language:"eng", engineMode:G8OCREngineMode.cubeOnly)
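
For readers who want this suggestion in context, here is a minimal sketch of how the call might sit inside an app. It assumes the fork's pod is installed and that the LSTM eng.traineddata file is bundled in the tessdata folder. The helper name recognizeText(in:) is hypothetical and not part of the library. That cubeOnly selects the LSTM engine in this fork is inferred from the replies in this thread, not confirmed against the fork's source.

```swift
import UIKit
import TesseractOCR  // assumption: the pod from the fork mentioned in this thread

// Hypothetical helper (not part of the library): runs OCR on one image
// and returns the recognized text, or nil if setup or recognition fails.
func recognizeText(in image: UIImage) -> String? {
    // Pass the engine mode explicitly, as suggested above, instead of
    // relying on the default used by G8Tesseract(language:).
    guard let tesseract = G8Tesseract(language: "eng", engineMode: .cubeOnly) else {
        // Initialization failed, e.g. the traineddata could not be loaded.
        return nil
    }
    tesseract.image = image
    // recognize() returns false when recognition did not complete.
    guard tesseract.recognize() else { return nil }
    return tesseract.recognizedText
}
```

If the initializer still fails with LSTM data, checking that the traineddata file matches the Tesseract version built into the fork is a reasonable next step.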

leinakesi avatar Oct 27 '17 06:10 leinakesi

@leinakesi Such basic knowledge, but it still saved my day!

kevlud avatar Nov 19 '17 18:11 kevlud

Has anyone managed to update that fork to a newer version of Tesseract? The makefile fails for me:

../../src/gplot.c:399:14: error: 'system' is unavailable: not available on iOS
    ignore = system(buf);  /* gnuplot || wgnuplot */

hactar avatar Sep 09 '18 00:09 hactar

Hi there, Any updates on that? :)

oleghnidets avatar Nov 13 '18 13:11 oleghnidets

Hi, everyone. I have upgraded Tesseract from 4.00.000alpha to the 4.0.0 release. PRs to improve it are welcome. https://github.com/chaoskyme/Tesseract-OCR-iOS

xwal avatar Dec 07 '18 17:12 xwal

Hi, I have a question: I am developing for iOS and used my own training library in my project, but unfortunately it hit a bug and the app crashed.

KGDeveloper avatar Jan 26 '19 02:01 KGDeveloper

Hey guys, for trouble with compilation, check this out.

https://github.com/chaoskyme/Tesseract-OCR-iOS/issues/3

I faced the same issues while upgrading to Tesseract 4.1, and after thorough (and painful) investigation got it working.

hejtmii avatar Oct 20 '19 19:10 hejtmii