tesseract
centos 7.9 tesseract5.2 cmake failed! boxchar.cpp:65:75: error...
error INFO:
/home/engine/wy/3rd/tesseract-5.2.0/src/training/pango/boxchar.cpp: In member function ‘void tesseract::BoxChar::GetDirection(int*, int*) const’:
/home/engine/wy/3rd/tesseract-5.2.0/src/training/pango/boxchar.cpp:65:75: error: ‘U_RIGHT_TO_LEFT_ISOLATE’ was not declared in this scope; did you mean ‘U_RIGHT_TO_LEFT_OVERRIDE’?
   65 |     if (dir == U_RIGHT_TO_LEFT || dir == U_RIGHT_TO_LEFT_ARABIC || dir == U_RIGHT_TO_LEFT_ISOLATE) {
      |                                                                           ^~~~~~~~~~~~~~~~~~~~~~~
      |                                                                           U_RIGHT_TO_LEFT_OVERRIDE
Environment
- Tesseract Version: 5.2.0
- Commit Number: tag 5.2.0
- Platform:centos7.9
error.log shows the details: error.log
Did you search the issue tracker before posting the issue? E.g. https://github.com/tesseract-ocr/tesseract/issues/1374
The CMake build should do what the Autotools build is doing: check for icu >=52.1 and refuse to build the training tools if this requirement is not met.
@amitdo : it does: https://github.com/tesseract-ocr/tesseract/runs/7640401854?check_suite_focus=true#step:6:82
here is relevant part of check: https://github.com/tesseract-ocr/tesseract/blob/94b9ca4343743d38fbb635ca88e50621bc2d8beb/src/training/CMakeLists.txt#L71-L77
In the provided log there is no info about ICU checks...
Thanks, Zdenko. I only looked at the CMakeLists.txt located in the root directory. I forgot that there is another one in the training dir.
if(PKG_CONFIG_FOUND)
pkg_check_modules(ICU REQUIRED IMPORTED_TARGET icu-uc icu-i18n)
There is no version check here.
@amitdo : you miss the point: according to the reporter's log, there was no check for ICU at all, so adding a version requirement there would not solve the reporter's problem. I wonder how the reporter managed to skip the ICU presence check. Plus ICU 52.1 was released 2013-10-09, so I really wonder if somebody is using an older version than that...
http://mirror.centos.org/centos/7.9.2009/os/x86_64/Packages/
libicu-50.2-4.el7_7.i686.rpm
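To see which ICU version a given machine actually provides, a small check along these lines may help. It is only a sketch: the helper name `version_at_least` is made up for illustration, the 52.1 threshold is the one mentioned for the Autotools build above, and the comparison assumes GNU `sort -V` is available.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when version $1 is >= version $2.
# Relies on GNU `sort -V` (version sort) from coreutils.
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

required="52.1"   # minimum ICU version named in this thread

# Query the installed ICU only if pkg-config can find the icu-uc module.
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists icu-uc; then
    installed=$(pkg-config --modversion icu-uc)
    if version_at_least "$installed" "$required"; then
        echo "ICU $installed is new enough"
    else
        echo "ICU $installed is older than $required"
    fi
else
    echo "icu-uc not found by pkg-config"
fi
```

With the libicu 50.2 package listed above, the comparison against 52.1 would fail, which matches the missing U_RIGHT_TO_LEFT_ISOLATE symbol in the compile error.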
So the solution is to use recent and well-maintained OS/distribution.
Hi, I use libicui18n.so (installed with the yum install libicu-devel command), but the Tesseract check script shows this error:
-- Checking for modules 'icu-uc;icu-i18n'
-- No package 'icu-uc' found
-- No package 'icu-i18n' found
After installing libicu-devel, pango-devel and cairo-devel, I ran:
cmake -D CMAKE_INSTALL_PREFIX=/usr/local -D CMAKE_BUILD_TYPE=RELEASE -D BUILD_SHARED_LIBS=ON ..
Then the error occurred (Linux version 3.10.0-1160.71.1.el7.x86_64 ([email protected]) (gcc version 4.8.5 20150623 (Red Hat 4.8.5-44) (GCC) ) #1 SMP Tue Jun 28 15:37:28 UTC 2022).
I also tried Ubuntu (Linux version 5.4.0-122-generic (buildd@lcy02-amd64-035) (gcc version 7.5.0 (Ubuntu 7.5.0-3ubuntu1~18.04)) #138~18.04.1-Ubuntu SMP Fri Jun 24 14:14:03 UTC 2022), and Tesseract builds fine there.
tks
You can try to build with autotools instead of CMake.
GCC 4.8 is not supported in Tesseract 5.x.
RHEL/Centos have newer GCC versions in their repos: http://mirror.centos.org/centos/7.9.2009/sclo/x86_64/rh/Packages/d/
So you can install GCC 11.
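As a rough illustration, the compiler check and the install steps could look like the sketch below. The function names are hypothetical, and the "major version below 7 is too old" threshold is an assumption based only on the GCC 7.5 build on Ubuntu reported in this thread; the devtoolset-11 package names are the ones I would expect in the SCL repository linked above, so please verify them there before running anything.

```shell
#!/bin/sh
# Hypothetical helper: print the major number of a GCC version string such as "4.8.5".
gcc_major() {
    printf '%s\n' "${1%%.*}"
}

# Assumption: a major version below 7 is treated as too old for Tesseract 5.x,
# because GCC 7.5 is reported as working in this thread and GCC 4.8 is not supported.
needs_newer_gcc() {
    [ "$(gcc_major "$1")" -lt 7 ]
}

if command -v gcc >/dev/null 2>&1; then
    current=$(gcc -dumpversion)
    if needs_newer_gcc "$current"; then
        # Only print the suggested commands; nothing is installed by this script.
        echo "GCC $current is too old; on CentOS 7 a newer toolchain may be installed with:"
        echo "  sudo yum install -y centos-release-scl"
        echo "  sudo yum install -y devtoolset-11-gcc devtoolset-11-gcc-c++"
        echo "  scl enable devtoolset-11 bash"
    else
        echo "GCC $current looks recent enough"
    fi
fi
```

The script only prints the commands, so they can be reviewed first. After `scl enable devtoolset-11 bash`, the build directory should be reconfigured from scratch so that CMake picks up the new compiler.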
Do you plan to train your own model using Tesseract?
If not, ICU, Pango and Cairo are not required.
Autotools: Pango, Cairo and ICU only required by training tools
I don't know if CMake behaves in the same way.
CMake behaves in the same way as Autotools. From the above information, I am sure that the reporter must have modified the CMake files to avoid these checks (ICU and GCC). So this is not a Tesseract problem, but a user problem (trying to compile Tesseract with software that is too old).
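If the training tools are not needed, one possible approach is to switch them off explicitly at configure time, so that ICU, Pango and Cairo are never checked. The sketch below only prints the command for review. It reuses the options from the reporter's cmake call and assumes that `BUILD_TRAINING_TOOLS` is the option name in Tesseract's top-level CMakeLists.txt; `configure_command` is a made-up helper name.

```shell
#!/bin/sh
# Hypothetical helper: print a configure command that skips the training tools.
# Assumption: BUILD_TRAINING_TOOLS is the option defined by Tesseract's CMake build.
configure_command() {
    prefix=${1:-/usr/local}
    printf 'cmake -D CMAKE_INSTALL_PREFIX=%s -D CMAKE_BUILD_TYPE=RELEASE -D BUILD_SHARED_LIBS=ON -D BUILD_TRAINING_TOOLS=OFF ..\n' "$prefix"
}

# Print the command for the default install prefix used in this thread.
configure_command /usr/local
```

The printed command is meant to be run from an empty build directory inside the Tesseract source tree, as in the reporter's original call.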