
add guide for tesseract

Open jonmz opened this issue 5 years ago • 3 comments

For a support case, I tried to install tesseract on a U7 account based on https://github.com/tesseract-ocr/tesseract/wiki/Compiling#install-elsewhere--without-root simply to find out if it works. Quickly sharing my results:

  1. Building leptonica (dependency):
$ curl -L https://github.com/DanBloomberg/leptonica/releases/download/1.79.0/leptonica-1.79.0.tar.gz | tar -xzvf -

$ cd leptonica-1.79.0

$ ./configure --prefix=$HOME/local/

$ make

$ make install
  2. Building tesseract:
$ curl -L https://github.com/tesseract-ocr/tesseract/archive/4.1.1.tar.gz | tar -xzvf -

$ cd tesseract-4.1.1

$ ./autogen.sh

$ export PKG_CONFIG_PATH=$HOME/local/lib/pkgconfig

$ LIBLEPT_HEADERSDIR=$HOME/local/include ./configure --prefix=$HOME/local/ --with-extra-libraries=$HOME/local/lib

$ make

$ make install
  3. Voilà:
$ ~/local/bin/tesseract --version
tesseract 4.1.1
 leptonica-1.79.0
  libjpeg 6b (libjpeg-turbo 1.2.90) : libpng 1.5.13 : libtiff 4.0.3 : zlib 1.2.7 : libwebp 1.0.3 : libopenjp2 2.3.1
 Found AVX2
 Found AVX
 Found FMA
 Found SSE

If anybody would want to create a guide, this might be a jumpstart.

jonmz avatar Jan 16 '20 16:01 jonmz

@jonmz thanks for the guide. Unfortunately, it errors out for me at the make step for tesseract:

/opt/rh/devtoolset-9/root/usr/libexec/gcc/x86_64-redhat-linux/9/ld: cannot find -lbrotlidec
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:3343: libtesseract.la] Error 1
make[1]: Leaving directory '/home/my_user_name/app/tesseract-5.3.3'
make: *** [Makefile:7836: all-recursive] Error 1

I did use newer versions (leptonica 1.83.1 and tesseract 5.3.3), so maybe that's the cause...

tipa avatar Oct 19 '23 14:10 tipa

I assume this comes from a change in Tesseract 5.0 that added support for running OCR on URL-based images: https://github.com/tesseract-ocr/tesseract/commit/286d8275c783062057d09bb8e5e6607a8917abd9. That feature requires curl, which in turn requires Brotli. You can either build without curl support (--with-curl=no) or make the Brotli library and header files available as well.
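For reference, the first option could look roughly like the sketch below. It reuses the configure call from the original post and only appends --with-curl=no. The source directory name (tesseract-5.3.3, taken from the error output above) and the $HOME/local prefix are assumptions, so adjust them to your own setup. The script does nothing but print a hint when no configure script is found.

```shell
# Sketch only: paths and versions follow this thread and are assumptions.
PREFIX="$HOME/local"
SRC_DIR="tesseract-5.3.3"   # assumed name of the unpacked source directory

# Same flags as the original configure call, plus --with-curl=no.
CONFIGURE_FLAGS="--prefix=$PREFIX/ --with-extra-libraries=$PREFIX/lib --with-curl=no"

if [ -x "$SRC_DIR/configure" ]; then
  (
    cd "$SRC_DIR" || exit 1
    export PKG_CONFIG_PATH="$PREFIX/lib/pkgconfig"
    # $CONFIGURE_FLAGS is left unquoted on purpose so it splits into
    # separate arguments (this assumes $HOME contains no spaces).
    LIBLEPT_HEADERSDIR="$PREFIX/include" ./configure $CONFIGURE_FLAGS \
      && make \
      && make install
  )
else
  echo "No configure script in $SRC_DIR; run ./autogen.sh there first." >&2
fi
```

If make already failed once in that directory, running make clean before reconfiguring is probably a good idea.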

FriedrichFroebel avatar Oct 19 '23 14:10 FriedrichFroebel

That seems to have worked - sweet! Thanks so much!

tipa avatar Oct 19 '23 15:10 tipa