tesseract
Error on creating TesseractEngine
Hi!
I'm just getting started with Tesseract and have this code, which runs fine on Windows:
using (var engine = new TesseractEngine("tessdata", "por"))
{
    var image = Pix.LoadFromFile(filePath);
    var page = engine.Process(image);
    text = page.GetText();
}
But I need to run it on Linux, specifically on the Mint distribution, where I use it in this form:
using (var engine = new TesseractEngine("./tessdata", "por"))
{
    var image = Pix.LoadFromFile(filePath);
    var page = engine.Process(image);
    text = page.GetText();
}
And I receive this error in the 'new TesseractEngine' line:
Unhandled exception. System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.ArgumentNullException: Value cannot be null. (Parameter 'path1')
   at System.ArgumentNullException.Throw(String paramName)
   at System.IO.Path.Combine(String path1, String path2)
   at InteropDotNet.LibraryLoader.InternalLoadLibrary(String baseDirectory, String platformName, String fileName)
   at InteropDotNet.LibraryLoader.CheckExecutingAssemblyDomain(String fileName, String platformName)
   at InteropDotNet.LibraryLoader.LoadLibrary(String fileName, String platformName)
   at InteropRuntimeImplementer.LeptonicaApiSignaturesInstance.LeptonicaApiSignaturesImplementation..ctor(LibraryLoader loader)
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   --- End of inner exception stack trace ---
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   at System.Reflection.MethodBaseInvoker.InvokeWithOneArg(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture)
   at InteropDotNet.InteropRuntimeImplementer.CreateInstance[T]()
   at Tesseract.Interop.LeptonicaApi.Initialize()
   at Tesseract.Interop.TessApi.Initialize()
   at Tesseract.Interop.TessApi.get_Native()
   at Tesseract.TesseractEngine..ctor(String datapath, String language, EngineMode engineMode, IEnumerable`1 configFiles, IDictionary`2 initialOptions, Boolean setOnlyNonDebugVariables)
   at Tesseract.TesseractEngine..ctor(String datapath, String language)
   at TesteConversaoPdfParaImagem.Program.ReadImage(String prefix, String filePath, String resultFileName, Boolean isSingleblock) in C:\Users\carlo\source\repos\TesteConversaoPdfParaImagem\TesteConversaoPdfParaImagem\Program.cs:line 127
   at TesteConversaoPdfParaImagem.Program.Main(String[] args) in C:\Users\carlo\source\repos\TesteConversaoPdfParaImagem\TesteConversaoPdfParaImagem\Program.cs:line 53
Can you help me with this, please?
Thanks a lot, guys! :)
System.ArgumentNullException: Value cannot be null. (Parameter 'path1') at System.ArgumentNullException.Throw(String paramName) at System.IO.Path.Combine(String path1, String path2)
I'm no expert, but it's definitely about your path :D. I would add Console.WriteLine(filePath); to see the result of your Path.Combine. You probably have a null there.
Yesterday I discovered the same error on the project that I just started. When I debug from Visual Studio there is no issue. When I publish and run it somewhere else, I get the same error message.
I discovered that the error appears only when the "Produce single file" checkbox is enabled. With it disabled, I don't get the error.
In both situations the tessdata folder is in the same location (at the same level as my console executable). In both situations I also confirmed that my code was pointing to the right location by adding a console message:
string tesseractDataPath = Path.Combine(AppContext.BaseDirectory, "tessdata");
Console.WriteLine($"Tesseract data folder: {tesseractDataPath}");
using var ocrEngine = new TesseractEngine(tesseractDataPath, "nld+eng", EngineMode.Default);
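Since the data path is correct in both cases, it may also help to log where the runtime thinks the assemblies live. The sketch below is a hypothetical helper (the `StartupDiagnostics` name is illustrative and not part of the Tesseract wrapper). It relies on the documented .NET behaviour that `Assembly.Location` is an empty string for assemblies bundled into a single-file app on .NET 5 and later, while `AppContext.BaseDirectory` should still point at the folder containing the executable.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical diagnostic helper; not part of the Tesseract wrapper.
public static class StartupDiagnostics
{
    // Builds a short report of the paths that matter when native libraries are located.
    public static string Describe(Assembly assembly)
    {
        // Empty string for assemblies bundled into a single-file app (.NET 5+).
        string location = assembly.Location;

        // Directory the runtime uses as the application base.
        string baseDirectory = AppContext.BaseDirectory;

        return $"Assembly.Location: '{location}'{Environment.NewLine}" +
               $"AppContext.BaseDirectory: '{baseDirectory}'{Environment.NewLine}" +
               $"Current directory: '{Directory.GetCurrentDirectory()}'";
    }
}
```

Calling `Console.WriteLine(StartupDiagnostics.Describe(typeof(TesseractEngine).Assembly));` before creating the engine should show whether the location is empty in the published build but populated when debugging.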
Hi guys!
I found the error. It happens in the class 'InteropDotNet.LibraryLoader', at line 86:
var baseDirectory = Path.GetDirectoryName(executingAssembly.Location);
The expression executingAssembly.Location returns null, and that is the cause of the crash. I created a helper class, 'EnvironmentUtils.cs', in 'Tesseract.Internal' with this code:
using System;
using System.IO;
using System.Reflection;

namespace Tesseract.Internal
{
    internal static class EnvironmentUtils
    {
        public static string AppPath(Assembly assembly)
        {
            string appPath = Path.GetDirectoryName(Path.GetDirectoryName(assembly?.Location));
            if (!string.IsNullOrWhiteSpace(appPath))
                return appPath;
            return AppPath();
        }

        public static string AppPath()
        {
            string appPath;
            appPath = Directory.GetCurrentDirectory();
            if (!string.IsNullOrWhiteSpace(appPath))
                return appPath;
            appPath = Environment.CurrentDirectory;
            if (!string.IsNullOrWhiteSpace(appPath))
                return appPath;
            appPath = Path.Combine(AppContext.BaseDirectory, "tessdata");
            if (!string.IsNullOrWhiteSpace(appPath))
                return appPath;
            appPath = Path.GetDirectoryName(Path.GetDirectoryName(Assembly.GetEntryAssembly()?.Location));
            if (!string.IsNullOrWhiteSpace(appPath))
                return appPath;
            appPath = AppDomain.CurrentDomain.BaseDirectory;
            if (!string.IsNullOrWhiteSpace(appPath))
                return appPath;
            throw new ArgumentNullException("Application path not found");
        }
    }
}
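A narrower fallback may be enough for this specific crash. In a single-file app on .NET 5 and later, `Assembly.Location` is documented to be an empty string rather than null, and `Path.GetDirectoryName` of an empty string appears to return null on modern .NET, which would explain the null `path1`. The sketch below assumes that behaviour. `BaseDirectoryResolver` is a hypothetical name, not something in the library, and this is not the fix proposed upstream.

```csharp
using System;
using System.IO;

// Hypothetical helper; a minimal sketch, not the library's actual code.
public static class BaseDirectoryResolver
{
    // Returns the directory of the given assembly location, or
    // AppContext.BaseDirectory when the location is null or empty
    // (as it is for assemblies bundled into a single-file app).
    public static string Resolve(string assemblyLocation)
    {
        if (!string.IsNullOrEmpty(assemblyLocation))
        {
            var directory = Path.GetDirectoryName(assemblyLocation);
            if (!string.IsNullOrEmpty(directory))
                return directory;
        }

        // Note: this value normally ends with a directory separator,
        // which Path.Combine handles without any extra trimming.
        return AppContext.BaseDirectory;
    }
}
```

The result could then be passed to `Path.Combine` together with the platform folder and library file name without risking a null first argument.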
With EnvironmentUtils in place, that problem was solved, but the error then changed to this:
Unhandled exception. System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> System.DllNotFoundException: Failed to find library "libleptonica-1.82.0.so" for platform x64.
   at InteropDotNet.LibraryLoader.LoadLibrary(String fileName, String platformName) in D:\git\tesseract\src\Tesseract\Internal\InteropDotNet\LibraryLoader.cs:line 57
   at InteropRuntimeImplementer.LeptonicaApiSignaturesInstance.LeptonicaApiSignaturesImplementation..ctor(LibraryLoader loader)
   at System.RuntimeMethodHandle.InvokeMethod(Object target, Void** arguments, Signature sig, Boolean isConstructor)
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   --- End of inner exception stack trace ---
   at System.Reflection.MethodBaseInvoker.InvokeDirectByRefWithFewArgs(Object obj, Span`1 copyOfArgs, BindingFlags invokeAttr)
   at System.Reflection.MethodBaseInvoker.InvokeWithOneArg(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture)
   at System.RuntimeType.CreateInstanceImpl(BindingFlags bindingAttr, Binder binder, Object[] args, CultureInfo culture)
   at InteropDotNet.InteropRuntimeImplementer.CreateInstance[T]() in D:\git\tesseract\src\Tesseract\Internal\InteropDotNet\InteropRuntimeImplementer.cs:line 45
   at Tesseract.Interop.LeptonicaApi.Initialize() in D:\git\tesseract\src\Tesseract\Interop\LeptonicaApi.cs:line 563
   at Tesseract.Interop.TessApi.Initialize() in D:\git\tesseract\src\Tesseract\Interop\BaseApi.cs:line 583
   at Tesseract.Interop.TessApi.get_Native() in D:\git\tesseract\src\Tesseract\Interop\BaseApi.cs:line 372
   at Tesseract.TesseractEngine..ctor(String datapath, String language, EngineMode engineMode, IEnumerable`1 configFiles, IDictionary`2 initialOptions, Boolean setOnlyNonDebugVariables) in D:\git\tesseract\src\Tesseract\TesseractEngine.cs:line 181
   at Tesseract.TesseractEngine..ctor(String datapath, String language) in D:\git\tesseract\src\Tesseract\TesseractEngine.cs:line 37
   at TesteConversaoPdfParaImagem.Program.ReadImage(String prefix, String filePath, String resultFileName, Boolean isSingleblock) in C:\Users\carlo\source\repos\TesteConversaoPdfParaImagem\TesteConversaoPdfParaImagem\Program.cs:line 147
   at TesteConversaoPdfParaImagem.Program.Extract(String pdfName, String readedFilePath) in C:\Users\carlo\source\repos\TesteConversaoPdfParaImagem\TesteConversaoPdfParaImagem\Program.cs:line 68
   at TesteConversaoPdfParaImagem.Program.Main(String[] args) in C:\Users\carlo\source\repos\TesteConversaoPdfParaImagem\TesteConversaoPdfParaImagem\Program.cs:line 14
I expect to come back to this new problem next week.
Thanks for your help :)
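For the `DllNotFoundException`, one way to narrow things down is to check whether the file named in the message is actually present where the loader might look. The sketch below is a hypothetical troubleshooting helper. The `<base>/<platform>/<file>` layout is an assumption inferred from the `InternalLoadLibrary(baseDirectory, platformName, fileName)` signature in the first stack trace, not a confirmed description of how the loader searches.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical troubleshooting helper; the probed layout is an assumption.
public static class NativeLibraryProbe
{
    // Candidate paths: <base>/<platform>/<file> first, then <base>/<file>.
    public static IEnumerable<string> Candidates(string baseDirectory, string platformName, string fileName)
    {
        yield return Path.Combine(baseDirectory, platformName, fileName);
        yield return Path.Combine(baseDirectory, fileName);
    }

    // Returns the first candidate that exists on disk, or null if none does.
    public static string FindFirstExisting(string baseDirectory, string platformName, string fileName)
    {
        foreach (var candidate in Candidates(baseDirectory, platformName, fileName))
        {
            if (File.Exists(candidate))
                return candidate;
        }
        return null;
    }
}
```

For example, `NativeLibraryProbe.FindFirstExisting(AppContext.BaseDirectory, "x64", "libleptonica-1.82.0.so")` returning null would suggest the native library was never copied next to the published application.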
Thanks, I'll see if I can find some time this evening to put in a fix. If I don't, feel free to create a pull request with a fix and I'll merge it in.
Sorry I haven't been all that active lately; pretty much all my free time is taken up with home renos/repairs.
(Sent as an email reply to Carlos Felippe Vernizze's comment of Sun, 30 June 2024, 4:53 am, shown above: https://github.com/charlesw/tesseract/issues/673#issuecomment-2198300967)
Was this ever fixed? I seem to be facing the same error.
I'm seeing exactly the same issue: when I publish to a single file, my application crashes with the same exception, whereas a Debug build works fine. I investigated further, and this looks like the same issue described in #609. It appears to be fixed by #657, but that PR is still open.
I have been testing with the proposed changes, and I can confirm this solves my issues. Perhaps @charlesw can have another look at this?
Same issue here when building with a WAP project. Logging confirms the datapath is set correctly, so this seems to be a bug.
When I enable the checkbox "Produce single file", I get the error.