Linux support missing for .NET Core

Open arthrp opened this issue 5 years ago • 48 comments

When trying to run a minimal application on Linux, an exception is thrown: System.DllNotFoundException: Failed to find library "libleptonica-1.78.0.so" for platform x64. So it seems native dependencies aren't included for any platform other than Windows.

arthrp avatar Jan 22 '20 14:01 arthrp

I'm also having this problem when trying to build in travis-ci with Linux. Is there a package that can be installed that will bring this file down? I have tried both of the following, neither of which helped:

sudo apt install tesseract-ocr
sudo apt install libleptonica-dev

tznind avatar Jan 24 '20 15:01 tznind

I needed to compile this file myself in order to get it working. I used these docs: https://github.com/tesseract-ocr/tesseract/wiki/Compiling

ferronsw avatar Jan 27 '20 07:01 ferronsw

The problem is that InteropDotNet seems to expect the native libleptonica and libtesseract in the corresponding folder (x64 or x86). So if you're as lazy as I am, you can install libtesseract-dev and libleptonica-dev, copy the *.so files to the bin/<Configuration>/netcoreapp3.1/x64 folder, rename them (the exception message will tell you which file is missing) and it suddenly works. Assuming your distro's software sources contain the proper versions, that is ;)

I think the appropriate solution though is to ship the binaries for Linux as it's done for Windows.

arthrp avatar Jan 27 '20 20:01 arthrp

I was thinking a better way of dealing with this would be to support falling back on just using the generic load dynamic library routine (i.e. standard search paths). That way it should just work provided the libraries have been installed.

The only issue I can think of is that it might make it harder to diagnose errors relating to finding or loading the library, which has been a consistent pain in the ass.
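As a rough way to check whether such a fallback could succeed on a given machine, the loader cache can be inspected from the shell. This is only a sketch: the function name in_loader_cache is made up for illustration, and it assumes the glibc `ldconfig -p` output format (one "name (flags) => path" entry per line).

```shell
# Hypothetical helper (not part of the wrapper): succeeds if the library name
# given as $1 appears as an entry in `ldconfig -p` style output read from stdin.
in_loader_cache() {
    awk -v name="$1" '$1 == name { found = 1 } END { exit !found }'
}

# Example usage on a glibc-based distro:
#   ldconfig -p | in_loader_cache liblept.so.5 && echo "standard search paths can find it"
```

If the check fails, a fallback to the standard search paths would presumably fail too, so the error message would still need to say which name was looked up.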

How does that sound?

charlesw avatar Jan 28 '20 22:01 charlesw

I'm having the same issue. I managed to compile the correct Leptonica 1.78.0 and Tesseract 4.1.0 myself, because I'm using a Docker image based on Debian buster and it only goes to 1.76 and 4.0.1, but I still get "System.DllNotFoundException: Failed to find library "libleptonica-1.78.0.so" for platform x64."

I tried copying the .so files into the x64 folder as suggested by @arthrp, and I also tried symlinks, but neither works. I don't know what else to do at this point.

Any idea?

cgrard avatar Feb 07 '20 15:02 cgrard

Not sure if this helps but @howff managed to get it working in our repo with this commit. It has some renames of the files in the x64 directory https://github.com/SMI/SmiServices/commit/d97879dd84c7288ca6e472c7f2b0f678bb9a47a3

tznind avatar Feb 07 '20 15:02 tznind

Thanks, at least it confirms that they need to be called x64\libleptonica-1.78.0.so and x64\libtesseract41.so. Oddly enough this is exactly what I do, I must be wrong somewhere else I guess...

cgrard avatar Feb 07 '20 15:02 cgrard

You will need to compile the linux .so for leptonica and tesseract and place them in the x64 directory. Make sure the filenames are x64/libtesseract41.so and x64/libleptonica-1.78.0.so so that the DLLs can find them, BUT you might find that internally the tesseract .so references the leptonica .so by a different name so you might need a symbolic link (or copy) named x64/libleptonica.so.5.3.0

You can use the linux 'ldd' utility to find out if there are any other dependencies missing.
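A minimal sketch of that check, assuming the usual `ldd` output format where an unresolved dependency is printed as "name => not found". The function name missing_libs is hypothetical.

```shell
# Hypothetical helper: read `ldd` output on stdin and print the name of every
# dependency that the dynamic loader reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Example usage (the path is an assumption; adjust to your output folder):
#   ldd x64/libtesseract41.so | missing_libs
```

Each name it prints is a file the loader still needs to be able to find, for example via a symbolic link or copy with exactly that name.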

howff avatar Feb 07 '20 15:02 howff

Thanks a lot for trying to help me with this. Basically, in my Docker image I have this:

root@st-ocr_service:/usr/local/lib# ls -alh
total 145M
drwxr-xr-x 1 root root 4.0K Feb  7 15:51 .
drwxr-xr-x 1 root root 4.0K Feb  7 15:51 ..
-rw-r--r-- 1 root root  17M Feb  7 14:14 liblept.a
-rwxr-xr-x 1 root root  984 Feb  7 14:14 liblept.la
lrwxrwxrwx 1 root root   16 Feb  7 14:14 liblept.so -> liblept.so.5.0.3
lrwxrwxrwx 1 root root   16 Feb  7 14:14 liblept.so.5 -> liblept.so.5.0.3
-rwxr-xr-x 1 root root 2.6M Feb  7 14:14 liblept.so.5.0.3
-rw-r--r-- 1 root root 123M Feb  7 14:19 libtesseract.a
-rwxr-xr-x 1 root root 1.1K Feb  7 14:19 libtesseract.la
lrwxrwxrwx 1 root root   21 Feb  7 14:19 libtesseract.so -> libtesseract.so.4.0.1
lrwxrwxrwx 1 root root   21 Feb  7 14:19 libtesseract.so.4 -> libtesseract.so.4.0.1
-rwxr-xr-x 1 root root 3.4M Feb  7 14:19 libtesseract.so.4.0.1
drwxr-xr-x 2 root root 4.0K Feb  7 15:51 pkgconfig

And this:

root@st-ocr_service:/app/x64# ls -alh
total 12K
drwxr-xr-x 1 root root 4.0K Feb  7 16:08 .
drwxr-xr-x 1 root root 4.0K Feb  7 15:51 ..
lrwxrwxrwx 1 root root   27 Feb  7 16:03 liblept.so.5 -> /usr/local/lib/liblept.so.5
lrwxrwxrwx 1 root root   27 Feb  7 16:08 libleptonica-1.78.0.so -> /usr/local/lib/liblept.so.5
lrwxrwxrwx 1 root root   36 Feb  7 16:08 libtesseract41.so -> /usr/local/lib/libtesseract.so.4.0.1

Which is consistent with what I got from ldd:

root@st-ocr_service:/app/x64# ldd libtesseract41.so 
	[...]
	liblept.so.5 => /usr/local/lib/liblept.so.5 (0x00007fe735ff6000)
	[...]

However, even with this configuration, I still get the System.DllNotFoundException: Failed to find library "libleptonica-1.78.0.so" for platform x64. which drives me crazy.

cgrard avatar Feb 07 '20 16:02 cgrard

Monday morning, fresh coffee, clear mind, I finally got this to work! My Leptonica and Tesseract builds weren't correct. I fixed that, and there were also some missing dependencies in my Docker image. Now everything is sorted out and working as it should!

cgrard avatar Feb 10 '20 14:02 cgrard

I am trying to compile on Windows 10 and deploy on AWS Lambda. I am getting the error about the missing libleptonica-1.78.0.so file. Is there any place I can download it from?

sudipkumarbhattacharya avatar Apr 14 '20 11:04 sudipkumarbhattacharya

@sudipkumarbhattacharya The linux binaries are linked from one of the previous comments

howff avatar Apr 14 '20 12:04 howff

This is how I fixed it on a Debian 9 server:

cd ~/MyProject/x64

sudo apt install libleptonica-dev
ln -s /usr/lib/x86_64-linux-gnu/liblept.so.5 liblept.so.5
ln -s /usr/lib/x86_64-linux-gnu/liblept.so.5 libleptonica-1.78.0.so

# Installed the repository from https://notesalexp.org/
sudo apt install libtesseract-dev
ln -s /usr/lib/x86_64-linux-gnu/libtesseract.so.4.0.1 libtesseract41.so

The paths and versions might be different on your machine.
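Since the paths vary, here is a small sketch that wraps the `ln -s` step and refuses to create a link when the source library is not where you expect it. The function name link_native and its argument order are made up for illustration.

```shell
# Hypothetical helper: link an installed library into the x64 folder under the
# file name the wrapper asks for, refusing to create a dangling link.
#   $1 = absolute path to an existing library file
#   $2 = target directory (created if needed)
#   $3 = file name the wrapper expects
link_native() {
    src="$1"; dir="$2"; name="$3"
    if [ ! -e "$src" ]; then
        echo "missing source library: $src" >&2
        return 1
    fi
    mkdir -p "$dir" && ln -sf "$src" "$dir/$name"
}

# Example usage (paths and versions are the ones from the commands above and
# may differ on your machine):
#   link_native /usr/lib/x86_64-linux-gnu/liblept.so.5 ~/MyProject/x64 libleptonica-1.78.0.so
```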

carloswbarros avatar Apr 14 '20 13:04 carloswbarros

Is anyone able to run it against .NET 5.0? I'm having issues with the example console app on Manjaro Linux after updating the dependencies and target framework.

Leptonica is OK, libtesseract is OK, and the resulting console app compiles, but it crashes with a segmentation fault when I run it. That makes me think this .NET wrapper is not working correctly under Linux.

dimon222 avatar Feb 10 '21 05:02 dimon222

Wouldn't it make more sense to just bundle the native dependencies in the package? There are many .net packages that do this rather than requiring that people have exactly the right libs installed (or perform manual steps to copy in those dependencies).

atrauzzi avatar Mar 15 '21 15:03 atrauzzi

Yes that's certainly an option, does anyone else have any opinions here? I don't have much experience with Linux so not sure of current best practices here.

Note I don't have the time to maintain Linux/Mac however will accept pull requests. Please do raise an issue, or comment on existing issue if appropriate, to discuss your ideas first though 🙂

charlesw avatar Mar 15 '21 19:03 charlesw

I think the process would be the same for Windows, macOS and Linux. The Windows library should also be including a static copy too, so maybe start there?

atrauzzi avatar Mar 15 '21 22:03 atrauzzi

I use version 4.1.1 of the .NET wrapper, and it worked on WSL2 (Ubuntu 20.04) using the solution by carloswbarros. But when deploying to Docker on AWS it fails with the message Failed to find library "libleptonica-1.80.0.so" for platform x64. or sometimes with Failed to find library "libtesseract41.so" for platform x64. I tried symlinks like

WORKDIR /app/x64
RUN ln -s /usr/lib/x86_64-linux-gnu/liblept.so.5 libleptonica-1.80.0.so
RUN ln -s /usr/lib/x86_64-linux-gnu/libtesseract.so.4.0.1 libtesseract41.so

and it does not help. I also tried copying the files to the x64 folder directly from the solution folder, with the same result. Now I think I may have to compile it myself. It really would be a good idea to have the Linux libraries installed with the NuGet package.

PaulEsprit avatar Apr 12 '21 15:04 PaulEsprit

I also get this error (the one PaulEsprit describes above) when running in a GitHub workflow and in a Docker installation.

MRmlik12 avatar Apr 14 '21 19:04 MRmlik12

This worked for me in a Dockerfile. We had to build libtesseract.so.4.0.1 from source.

RUN apt-get update && apt-get install -y libleptonica-dev
RUN apt-get update && apt-get install -y libtesseract-dev
RUN apt-get update && apt-get install -y libc6-dev
RUN apt-get update && apt-get install -y libjpeg62-turbo-dev
RUN apt-get update && apt-get install -y libgdiplus

WORKDIR /app/x64
RUN ln -s /usr/lib/x86_64-linux-gnu/liblept.so.5 liblept.so.5
RUN ln -s /usr/lib/x86_64-linux-gnu/liblept.so.5 libleptonica-1.80.0.so
COPY ./sofile/libtesseract.so.4.0.1 libtesseract41.so

PaulEsprit avatar Apr 16 '21 18:04 PaulEsprit

I added the installation of libleptonica-dev and libtesseract-dev and continue to receive the error Failed to find library "libleptonica-1.80.0.so" for platform x64. This is my Dockerfile, used with .NET Core 3.1:

FROM mcr.microsoft.com/dotnet/runtime:3.1 AS base
WORKDIR /app

WORKDIR /app/x64

RUN apt update && apt install libgif7 libjpeg62 libopenjp2-7 libpng16-16 libtiff5 libwebp6 libc6-dev libgdiplus libleptonica-dev libtesseract-dev -y

RUN ln -s /usr/lib/x86_64-linux-gnu/liblept.so.5 liblept.so.5
RUN ln -s /usr/lib/x86_64-linux-gnu/libleptonica-1.80.0 libleptonica-1.80.0.so
RUN ln -s /usr/lib/x86_64-linux-gnu/libtesseract.so.4.0.1 libtesseract41.so

FROM mcr.microsoft.com/dotnet/sdk:3.1 AS build
WORKDIR /src

COPY ["CrateHOCR/CrateHOCR.csproj", "CrateHOCR/"]
RUN dotnet restore "CrateHOCR/CrateHOCR.csproj"
COPY . .
WORKDIR "/src/CrateHOCR"
RUN dotnet build "CrateHOCR.csproj" -c Release -o /app/build

FROM build AS publish
RUN dotnet publish "CrateHOCR.csproj" -c Release -o /app/publish

FROM base AS final
WORKDIR /app

COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "CrateHOCR.dll"]

Does anyone have any suggestions as to where I am going wrong?

fernandovictorTI avatar May 10 '21 20:05 fernandovictorTI

Even with the libleptonica-1.80.0.so inside the x64 folder it still has an error

[Image: Files_LIB]
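Note that a file name appearing in the x64 folder does not guarantee that it resolves to a real library: `ln -s` creates the link even when the target path does not exist. A minimal sketch for listing such dangling links, with a made-up function name:

```shell
# Hypothetical helper: print every symlink under directory $1 whose target
# does not exist. With -L, find follows links, so only broken links still
# report as type "l".
broken_links() {
    find -L "$1" -type l
}

# Example usage inside the container (the directory is an assumption):
#   broken_links /app/x64
```

Any path it prints points at a missing target, which the wrapper would most likely report as a library that cannot be found.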

fernandovictorTI avatar May 11 '21 14:05 fernandovictorTI

The Dockerfile that PaulEsprit posted above (installing the -dev packages, symlinking liblept.so.5 in /app/x64, and copying in a self-built libtesseract.so.4.0.1) was very useful to me, thanks.

54Ezreal avatar Jun 16 '21 13:06 54Ezreal

I have the same problem as described here. I was able to work around it at least by pre-installing the mentioned packages and creating the symlinks.

Geisterli avatar Jun 28 '21 08:06 Geisterli

Does anyone know where I can get libtesseract.so.4.0.1?

nmills3 avatar Jul 01 '21 20:07 nmills3

As I said in this comment (https://github.com/charlesw/tesseract/issues/503#issuecomment-613430901), I used the repository from https://notesalexp.org/ and it worked.

carloswbarros avatar Jul 02 '21 07:07 carloswbarros

I was hoping someone just had a download for it, as I don't have anything to run Ubuntu on currently.

nmills3 avatar Jul 02 '21 07:07 nmills3

One of the previous comments has a link to pre-compiled libraries here: https://github.com/SMI/SmiServices/tree/master/lib and an explanation of how to rename them.

howff avatar Jul 02 '21 07:07 howff

Thanks. I've added the file to my x64 directory and named it libtesseract41.so, but I'm still getting the DllNotFoundException.

nmills3 avatar Jul 02 '21 08:07 nmills3

Sorry if this is a stupid question, I'm not great at Ubuntu. Does this mean that the DllNotFoundException is because I'm missing a dependency?

# ldd libtesseract41.so
        linux-vdso.so.1 (0x00007ffd4a981000)
        libleptonica.so.5.3.0 => not found
        libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fccd841a000)
        libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fccd8296000)
        libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fccd8113000)
        libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fccd7f52000)
        libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fccd7f38000)
        libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fccd7f05000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fccd898c000)
        libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fccd7f00000)

nmills3 avatar Jul 02 '21 08:07 nmills3