Tesseract engine fails to initialize if run from webserver
I've created a class library that references Tesseract, with a Parser class that has a function to lift a document. This works fine from a command-line app. However, when I reference the class library from a web service project, I am told that the Tesseract engine failed to initialize.
I have language resources available and the code does work from Commandline.
I instantiate the engine with this line, and it throws the exception:
TesseractEngine engine = new TesseractEngine(@"./tessdata", language.ToString(), EngineMode.Default);
language is an enum whose current value is 'dan', for which I have the Danish language pack available.
Tesseract.TesseractException
HResult=0x80131500
Message=Failed to initialise tesseract engine.. See https://github.com/charlesw/tesseract/wiki/Error-1 for details.
Source=Tesseract
StackTrace:
at Tesseract.TesseractEngine.Initialise(String datapath, String language, EngineMode engineMode, IEnumerable`1 configFiles, IDictionary`2 initialValues, Boolean setOnlyNonDebugVariables)
at Tesseract.TesseractEngine..ctor(String datapath, String language, EngineMode engineMode, IEnumerable`1 configFiles, IDictionary`2 initialOptions, Boolean setOnlyNonDebugVariables)
at Tesseract.TesseractEngine..ctor(String datapath, String language, EngineMode engineMode)
at Inventio.OCR.Parser.LiftDocument(Byte[] document, Language language) in C:\Development\Projects\Inventio OCR\Inventio.OCR\Parser.cs:line 23
at OCR_WebService.Controllers.OcrController.Post() in C:\Development\Projects\Inventio OCR\OCR WebService\Controllers\OcrController.cs:line 29
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass6_2.<GetExecutor>b__2(Object instance, Object[] methodParameters)
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.Execute(Object instance, Object[] arguments)
at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)
The relative path @"./tessdata" did not work for the web service, so I added a binPath parameter to my function and calculate the path to the bin folder this way:
string binPath = HttpContext.Current.Server.MapPath("..") + @"\bin";
Works locally.
I'm pretty sure it's because the working directory isn't what you think it is and therefore it can't find the language data.
The usual solution is to resolve an absolute path and then use that. I'm pretty sure the ASP.NET MVC demo does that.
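As a rough illustration of resolving an absolute path first, here is a minimal sketch. The names TessdataLocator and Resolve are hypothetical and not part of the Tesseract wrapper. It assumes the language data sits in a "tessdata" folder under a base directory that the caller supplies, and that the file is named after the language code (dan.traineddata for Danish). Checking for the file up front gives a clearer error than the generic initialisation failure.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Tesseract library.
public static class TessdataLocator
{
    // Returns the absolute path of the "tessdata" folder under baseDirectory.
    // Throws if the expected <language>.traineddata file is not there.
    public static string Resolve(string baseDirectory, string language)
    {
        // GetFullPath removes any dependence on the current working directory.
        string dataPath = Path.GetFullPath(Path.Combine(baseDirectory, "tessdata"));
        string trainedData = Path.Combine(dataPath, language + ".traineddata");

        if (!File.Exists(trainedData))
        {
            throw new FileNotFoundException(
                "Language data not found. Working directory is " + Environment.CurrentDirectory,
                trainedData);
        }

        return dataPath;
    }
}
```

The returned value can then be passed as the datapath argument, for example new TesseractEngine(TessdataLocator.Resolve(baseDir, "dan"), "dan", EngineMode.Default), where baseDir is whichever folder the host actually deploys tessdata into.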
(Sent by email in reply to HenrikHolmIT's original post of Thu, 9 Sep 2021, 21:17; thread: https://github.com/charlesw/tesseract/issues/573)
@charlesw That is true. I now set the bin path in the client code. The command-line app initializes Tesseract with 'Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)', and the web service uses 'HttpContext.Current.Server.MapPath("..")'.
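A possible alternative that keeps the host-specific choice out of the client code is sketched below. HostPaths and BinDirectory are hypothetical names. The sketch relies on an assumption worth verifying in your own environment: under classic ASP.NET, AppDomain.CurrentDomain.RelativeSearchPath points at the bin folder, while in a console app it is null and AppDomain.CurrentDomain.BaseDirectory is already the folder containing the executable.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Tesseract library.
public static class HostPaths
{
    // Pure overload so the logic can be checked without a web host.
    public static string BinDirectory(string baseDirectory, string relativeSearchPath)
    {
        // Console app case (assumed): no private probing path is set.
        if (string.IsNullOrEmpty(relativeSearchPath))
        {
            return baseDirectory;
        }

        // Path.Combine returns the second argument unchanged when it is rooted,
        // so this handles both a relative "bin" and an absolute bin path.
        return Path.Combine(baseDirectory, relativeSearchPath);
    }

    // Convenience overload that reads the values from the current AppDomain.
    public static string BinDirectory()
    {
        return BinDirectory(
            AppDomain.CurrentDomain.BaseDirectory,
            AppDomain.CurrentDomain.RelativeSearchPath);
    }
}
```

With this, both hosts could build the data path the same way, for example Path.Combine(HostPaths.BinDirectory(), "tessdata"), provided the tessdata folder is copied to the build output.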