ruby-tesseract-ocr

CompilationError with tesseract-ocr 3.04

Open atuyosi opened this issue 9 years ago • 21 comments

I'm getting a CompilationError when I require 'tesseract-ocr'.

CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/00ac1de4050b632b230475bd71c0dc3a7de45a89.log
    from /usr/lib/ruby/gems/2.2.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'

full trace is here, and ffi-inline's error log

Is the latest tesseract-ocr (3.04) supported, or has the API changed?

A similar problem is reported below.

ruby on rails - Tesseract-ocr gem issue on mac os x - Stack Overflow

OS: Arch Linux

gem

$ gem list tesseract-ocr -d

*** LOCAL GEMS ***

tesseract-ocr (0.1.8)
    Author: meh.
    Homepage: http://github.com/meh/ruby-tesseract-ocr
    License: BSD
    Installed at: /usr/lib/ruby/gems/2.2.0

    A wrapper library to the tesseract-ocr API.

tesseract

$ tesseract -v
tesseract 3.04.00
 leptonica-1.71
  libgif 5.1.0 : libjpeg 8d : libpng 1.6.18 : libtiff 4.0.4 : zlib 1.2.8 : libwebp 0.4.3

ruby

$ ruby -v
ruby 2.2.3p173 (2015-08-18 revision 51636) [x86_64-linux]

Thanks in advance.

atuyosi avatar Aug 27 '15 17:08 atuyosi

Yeah, it looks like they changed quite a lot, especially regarding output.

Supporting it will take some time; in the meantime you can use downgrade or downgrader from the AUR.

meh avatar Aug 27 '15 17:08 meh

OK. Thanks.

atuyosi avatar Aug 27 '15 18:08 atuyosi

:+1:

McRip avatar Aug 28 '15 12:08 McRip

Do you know what changed? Perhaps I can help.

acrogenesis avatar Sep 02 '15 22:09 acrogenesis

@acrogenesis it looks like they added a TessResultRenderer class, which is used in place of STRING for ProcessPages.

meh avatar Sep 02 '15 22:09 meh

The changes in the following fork fixed the problem for me with the Tesseract 3.04 baseline: https://github.com/ortutay/ruby-tesseract-ocr/commit/74a4042a07da0f8bf54d06ff01a1647bbdeeac92

This also applies to MacOS and Tesseract installed via Homebrew which now defaults to 3.04.

@meh can you share your thoughts on this change?

cxhartmann avatar Oct 20 '15 20:10 cxhartmann

@cxhartmann the problem is that the Ruby side of things expects process_page to store its value in a STRING*, which is not the case anymore.

With that change it's going to compile, but it's going to segfault or worse as soon as you use anything related to process_page.

meh avatar Oct 20 '15 21:10 meh

@meh Ah, I see. So there is more to it. Bummer. Does it only break if you use process_page? I'd guess it might be more than just that.

For now I'm reverting to Tesseract 3.02, and that seems to be working. Homebrew now points to 3.04 (as of September), so I ran brew uninstall and pulled down an older revision of the Homebrew formula to build 3.02 for me: https://github.com/Homebrew/homebrew/blob/master/Library/Formula/tesseract.rb (check a few revisions back)

cxhartmann avatar Oct 20 '15 23:10 cxhartmann

@cxhartmann yes, and the biggest problem is getting this gem to work with both pre and post 3.04.
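
For illustration only, here is a minimal Ruby sketch (not code from the gem) of one way to tell the two API generations apart at load time: parse the first line of `tesseract -v` and compare it against 3.04. The helper names are hypothetical, and the 3.04 cutoff is an assumption taken from the reports in this thread.

```ruby
require 'open3'
require 'rubygems' # for Gem::Version; normally loaded already

# Hypothetical helper: extract the version from `tesseract -v` output,
# e.g. "tesseract 3.04.00\n leptonica-1.71\n". Returns nil if not found.
def parse_tesseract_version(output)
  match = output.match(/^tesseract\s+v?(\d+(?:\.\d+)+)/)
  match && Gem::Version.new(match[1])
end

# Hypothetical helper: true when ProcessPages is assumed to take a
# TessResultRenderer* instead of a STRING* (assumed cutoff: 3.04).
def renderer_api?(version)
  version >= Gem::Version.new('3.04')
end

# Runs the CLI and checks both streams, since the version banner may be
# printed on either one. Returns nil when tesseract is not on the PATH.
def installed_tesseract_version
  out, err, _status = Open3.capture3('tesseract', '-v')
  parse_tesseract_version("#{out}\n#{err}")
rescue Errno::ENOENT
  nil
end
```

A wrapper could call `renderer_api?(installed_tesseract_version)` once and choose which inline C++ to compile; whether that is enough to support both versions is exactly the open question here.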

meh avatar Oct 21 '15 00:10 meh

@meh Any word on supporting 3.04?

tpendragon avatar Jan 25 '16 22:01 tpendragon

I haven't had the time to work on it, unfortunately; it's on my endless TODO list :rage4:

meh avatar Jan 25 '16 22:01 meh

+1

alexhanh avatar Jul 13 '16 08:07 alexhanh

I wanted to use the easy_captcha_solver Ruby gem, which requires the tesseract-ocr Ruby gem. It installed without error, but when I try to use it, tesseract-ocr fails to compile.

OS :

$ cat /etc/*-release                                                                                                                                                                                                                        
DISTRIB_ID=ManjaroLinux
DISTRIB_RELEASE=16.08
DISTRIB_CODENAME=Ellada
DISTRIB_DESCRIPTION="Manjaro Linux"
Manjaro Linux
NAME="Manjaro Linux"
ID=manjaro
PRETTY_NAME="Manjaro Linux"
ANSI_COLOR="1;32"
HOME_URL="http://www.manjaro.org/"
SUPPORT_URL="http://www.manjaro.org/"
BUG_REPORT_URL="http://bugs.manjaro.org/"

Gem :

$ gem list tesseract-ocr -d      
*** LOCAL GEMS ***

tesseract-ocr (0.1.8)
    Author: meh.
    Homepage: http://github.com/meh/ruby-tesseract-ocr
    License: BSD
    Installed at: /home/noraj/.gem/ruby/2.3.0

    A wrapper library to the tesseract-ocr API.

tesseract :

$ tesseract -v                                     
tesseract 3.04.01
 leptonica-1.73
  libgif 5.1.2 : libjpeg 8d (libjpeg-turbo 1.4.2) : libpng 1.6.25 : libtiff 4.0.6 : zlib 1.2.8 : libwebp 0.5.1

ruby :

$ ruby -v
ruby 2.3.1p112 (2016-04-26 revision 54768) [x86_64-linux]

It's not clear whether I need to require tesseract or tesseract-ocr in my Ruby code. Both fail the same way:

irb(main):002:0> require 'tesseract'
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/81b6fb2baace695a88ac35bc54fcc39bf2dc1e42.log
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders/c.rb:114:in `shared_object'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:90:in `block in build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `instance_eval'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:54:in `singleton_inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:39:in `inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:30:in `<module:BaseAPI>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:27:in `<module:C>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<module:Tesseract>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c.rb:89:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/api.rb:26:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract-ocr.rb:35:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract.rb:25:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:127:in `rescue in require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:40:in `require'
    from (irb):2
    from /usr/bin/irb:11:in `<main>'
irb(main):003:0> require 'tesseract-ocr'
CompilationError: compile error: see logs at /tmp/.ffi-inline-1000/81b6fb2baace695a88ac35bc54fcc39bf2dc1e42.log
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/compilers/gcc.rb:35:in `compile'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders/c.rb:114:in `shared_object'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:90:in `block in build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `instance_eval'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/builders.rb:87:in `build'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:54:in `singleton_inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/ffi-inline-0.0.4.3/lib/ffi/inline/inline.rb:39:in `inline'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:30:in `<module:BaseAPI>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:27:in `<module:C>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<module:Tesseract>'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:25:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/c.rb:89:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract/api.rb:26:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /home/noraj/.gem/ruby/2.3.0/gems/tesseract-ocr-0.1.8/lib/tesseract-ocr.rb:35:in `<top (required)>'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
    from (irb):3
    from /usr/bin/irb:11:in `<main>'

It's also not clear whether tesseract (the distribution package, for example) is needed for the tesseract-ocr Ruby gem, or whether the tesseract Ruby gem is needed for the tesseract-ocr Ruby gem.

Here is a full ffi-inline error log file.

noraj avatar Oct 22 '16 19:10 noraj

This is still an issue:

tesseract 3.04.01
 leptonica-1.74
  libjpeg 8d (libjpeg-turbo 1.5.0) : libpng 1.6.25 : libtiff 4.0.6 : zlib 1.2.8

*** LOCAL GEMS ***

tesseract-ocr (0.1.8)

Error:

In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:356:8: note: initializing argument 1 of ‘void tesseract::TessBaseAPI::SetImage(Pix*)’
   void SetImage(Pix* pix);
        ^~~~~~~~
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb: In function ‘bool process_pages(tesseract::TessBaseAPI*, const char*, STRING*)’:
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:183:55: error: no matching function for call to ‘tesseract::TessBaseAPI::ProcessPages(const char*&, NULL, int, STRING*&)’
     return api->ProcessPages(filename, NULL, 0, output);
                                                       ^
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:541:8: note: candidate: bool tesseract::TessBaseAPI::ProcessPages(const char*, const char*, int, tesseract::TessResultRenderer*)
   bool ProcessPages(const char* filename, const char* retry_config,
        ^~~~~~~~~~~~
/usr/include/tesseract/baseapi.h:541:8: note: no known conversion for argument 4 from ‘STRING*’ to ‘tesseract::TessResultRenderer*’
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb: In function ‘bool process_page(tesseract::TessBaseAPI*, Pix*, int, const char*, STRING*)’:
/home/john/.rvm/gems/ruby-2.3.1/gems/tesseract-ocr-0.1.8/lib/tesseract/c/baseapi.rb:189:71: error: no matching function for call to ‘tesseract::TessBaseAPI::ProcessPage(Pix*&, int&, const char*&, NULL, int, STRING*&)’
     return api->ProcessPage(pix, page_index, filename, NULL, 0, output);
                                                                       ^
In file included from /tmp/.ffi-inline-1001/23d9ec096cb66aab370b5806b2d0cd5148975a4e.cpp:1:0:
/usr/include/tesseract/baseapi.h:556:8: note: candidate: bool tesseract::TessBaseAPI::ProcessPage(Pix*, int, const char*, const char*, int, tesseract::TessResultRenderer*)
   bool ProcessPage(Pix* pix, int page_index, const char* filename,
        ^~~~~~~~~~~
/usr/include/tesseract/baseapi.h:556:8: note: no known conversion for argument 6 from ‘STRING*’ to ‘tesseract::TessResultRenderer*’

jef-abraham avatar Jun 03 '17 21:06 jef-abraham

I tried using this

brew install https://raw.githubusercontent.com/Homebrew/homebrew/8ba134eda537d2cee7daa7ebdd9f728389d9c53e/Library/Formula/tesseract.rb

to install a downgraded version of Tesseract on my Mac. However, I get the following error:

Error: Calling Resource#sha1 is disabled! Use Resource#sha256 instead.
/Users/maheshmesta/Library/Caches/Homebrew/Formula/tesseract.rb:123:in `block (2 levels) in <class:Tesseract>'

How do I rectify this issue?

Mahesh8 avatar Aug 28 '17 10:08 Mahesh8

@Mahesh8 Tried the same, getting nowhere so far

tjaklitsch avatar Sep 04 '17 12:09 tjaklitsch

After fiddling around for a while I came up with a solution. I've modified the formula file to use sha256 and also updated the broken links in it.

  • First, download the following file and save it as Tesseract.rb: https://gist.github.com/arcticbarra/631bf0fee3c7eacc2c8b1e7b70e3e85d
  • Uninstall current version of tesseract with brew uninstall tesseract
  • cd to the directory and then run brew install Tesseract.rb
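
For anyone who would rather check the checksums in an edited formula than trust them, a minimal Ruby sketch (not part of the gist above) that prints the SHA-256 of a locally downloaded file. The helper name is hypothetical; the file to check is whatever archive the formula's url points at.

```ruby
require 'digest'

# Hypothetical helper: hex SHA-256 digest of a local file, i.e. the kind
# of value that goes in a formula's sha256 line.
def formula_sha256(path)
  Digest::SHA256.file(path).hexdigest
end

# When run as a script, print the digest of the file given as the first argument.
if __FILE__ == $PROGRAM_NAME && ARGV[0]
  puts formula_sha256(ARGV[0])
end
```

Compare the printed value with the sha256 entry in the formula before running brew install on it.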

enriquebrgn avatar Sep 04 '17 17:09 enriquebrgn

This fix doesn't seem to be working anymore - is there a current workaround?

jpperlm avatar Dec 20 '17 04:12 jpperlm

It doesn't work because it uses some outdated Homebrew syntax. I commented out a few lines and was able to install Tesseract 3.02 and make this lib work!

For anyone looking for which lines to comment out: https://gist.github.com/zachfeldman/bfc7bac4543d466e9c096d585e373fbf

zachfeldman avatar Feb 25 '18 03:02 zachfeldman

thank you @zachfeldman -- with the file above I'm getting

Error: Tesseract: Calling `sha256 "digest" => :tag` in a bottle block is disabled! Use `brew style --fix` on the formula to update the style or use `sha256 tag: "digest"` instead.

I wonder where this is coming from because I don't see any such syntax in your Tesseract.rb (from gist) above

jasonfb avatar Oct 15 '21 11:10 jasonfb

There are two odd download links in the script mentioned in the solution here https://github.com/meh/ruby-tesseract-ocr/issues/50#issuecomment-327005723 which I don't trust. My main concern is the Google Drive link (which is now broken as well). So the issue is still present for me.

Hyperadministrator avatar Jan 21 '23 15:01 Hyperadministrator