Tesseract engine fails to initialize if run from webserver

Open HenrikHolmIT opened this issue 3 years ago • 3 comments

I've created a class library that references Tesseract, with a Parser class that has a function to lift a document. This works fine from a command-line app. However, when I reference the class library from a web service project, the Tesseract engine fails to initialize.

I have the language resources available, and the same code works from the command line.

I instantiate the engine with this line, which throws the exception:

TesseractEngine engine = new TesseractEngine(@"./tessdata", language.ToString(), EngineMode.Default);

language is an enum whose current value is 'dan', for which I have the Danish language pack available.

Tesseract.TesseractException
HResult=0x80131500
Message=Failed to initialise tesseract engine.. See https://github.com/charlesw/tesseract/wiki/Error-1 for details.
Source=Tesseract
StackTrace:
   at Tesseract.TesseractEngine.Initialise(String datapath, String language, EngineMode engineMode, IEnumerable`1 configFiles, IDictionary`2 initialValues, Boolean setOnlyNonDebugVariables)
   at Tesseract.TesseractEngine..ctor(String datapath, String language, EngineMode engineMode, IEnumerable`1 configFiles, IDictionary`2 initialOptions, Boolean setOnlyNonDebugVariables)
   at Tesseract.TesseractEngine..ctor(String datapath, String language, EngineMode engineMode)
   at Inventio.OCR.Parser.LiftDocument(Byte[] document, Language language) in C:\Development\Projects\Inventio OCR\Inventio.OCR\Parser.cs:line 23
   at OCR_WebService.Controllers.OcrController.Post() in C:\Development\Projects\Inventio OCR\OCR WebService\Controllers\OcrController.cs:line 29
   at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.<>c__DisplayClass6_2.<GetExecutor>b__2(Object instance, Object[] methodParameters)
   at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ActionExecutor.Execute(Object instance, Object[] arguments)
   at System.Web.Http.Controllers.ReflectedHttpActionDescriptor.ExecuteAsync(HttpControllerContext controllerContext, IDictionary`2 arguments, CancellationToken cancellationToken)

HenrikHolmIT avatar Sep 09 '21 11:09 HenrikHolmIT

@"./tessdata" path did not work for webservices. So I've added a binPath parameter to my function and calculate the path to bin folder this way

string binPath = HttpContext.Current.Server.MapPath("..") + @"\bin";

Works locally.

HenrikHolmIT avatar Sep 09 '21 12:09 HenrikHolmIT

I'm pretty sure it's because the working directory isn't what you think it is and therefore it can't find the language data.

The solution is normally to resolve an absolute path and use that. I'm pretty sure the ASP.NET MVC demo does that.
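
As a rough illustration of resolving an absolute path, here is a minimal sketch. TessdataLocator, ResolveFrom and ResolveDefault are hypothetical names, not part of the Tesseract wrapper. It assumes the language data sits in a tessdata folder either directly under AppDomain.CurrentDomain.BaseDirectory (typically the exe folder for a console app) or under its bin subfolder (typically where a classic ASP.NET app keeps its binaries), and that the file is named after the language code, e.g. dan.traineddata.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the Tesseract wrapper.
public static class TessdataLocator
{
    // Returns the absolute path of the first "<root>/tessdata" folder
    // that contains "<language>.traineddata"; throws if none does.
    public static string ResolveFrom(string language, params string[] roots)
    {
        foreach (string root in roots)
        {
            if (string.IsNullOrEmpty(root)) continue;

            string dir = Path.GetFullPath(Path.Combine(root, "tessdata"));
            if (File.Exists(Path.Combine(dir, language + ".traineddata")))
            {
                return dir;
            }
        }

        throw new DirectoryNotFoundException(
            "No tessdata folder containing " + language + ".traineddata was found.");
    }

    // Assumption: the data lives next to the binaries, i.e. in the base
    // directory (console app) or in its "bin" subfolder (classic ASP.NET).
    public static string ResolveDefault(string language)
    {
        string baseDir = AppDomain.CurrentDomain.BaseDirectory;
        return ResolveFrom(language, baseDir, Path.Combine(baseDir, "bin"));
    }
}

// Possible usage with the constructor from the original post:
//   string dataPath = TessdataLocator.ResolveDefault("dan");
//   var engine = new TesseractEngine(dataPath, "dan", EngineMode.Default);
```

Because the path no longer depends on the process working directory, the same call can be used from both hosts, and a missing language file is reported before the engine is constructed.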

charlesw avatar Sep 09 '21 20:09 charlesw

@charlesw That is true. I now set the bin path in the client code. For the command-line app, the path is resolved with 'Path.GetDirectoryName(Assembly.GetExecutingAssembly().Location)'; for the web service, it is resolved with 'HttpContext.Current.Server.MapPath("..")'.

HenrikHolmIT avatar Sep 13 '21 07:09 HenrikHolmIT