Robert Sachunsky
For the record: Tesseract itself is a little weak in documenting this properly. (It happened when transitioning from version 3 to LSTM-based 4.) 1. `OSD` (as in `DetectOrientationScript()` or `DetectOS()`)...
Hi @Archilegt, sure, if you have suitable ground truth (i.e. training data, pairs of image and text for individual lines), you can do HTR with Tesseract, too. Modern OCR engines...
We are currently doing something very similar; see [here](https://wrznr.github.io/dhh-text-2021) for details (in German). Basically, if you want to follow the above OCR-D-based workflow (or variants of it with different...
For legacy models, the effect is there. For LSTM models, these kinds of settings are not constraints but just hints. See issues/documentation in Tesseract itself.
> By the way, are there any embedded debug support for the `tesseract` app which can be activated? Yes, you can: [build with debugging enabled](https://tesseract-ocr.github.io/tessdoc/Compiling-%E2%80%93-GitInstallation.html#debug-builds) and then enable any of...
> Why the characters recognized by `lstmeval` and `tesseract` are different? Is it normal? Yes, it's not unlikely, since the latter is much more complex – e.g. because it contains...
> Is this really a tesstrain issue? You are right, this should probably be transferred to the tesseract repo.
@jhartungBE all we have at this point are suspicions (what to look for). Have you tried … - `PSM=13` / `--psm 13` - with traineddata from `tessdata_best` / without `--convert_to_int`...
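For anyone who wants to script those first two checks, here is a minimal sketch in Python. It assumes the `tesseract` binary is on the PATH; the helper names `build_tesseract_cmd` and `run_tesseract`, the default language `eng`, and all file paths are hypothetical. `--psm`, `-l` and `--tessdata-dir` are regular Tesseract CLI options, and page segmentation mode 13 treats the image as a single raw text line.

```python
import subprocess


def build_tesseract_cmd(image, psm=13, lang="eng", tessdata_dir=None):
    """Return the argument list for a Tesseract CLI call that prints text to stdout."""
    # "stdout" as the output base makes Tesseract print the result instead of writing a file.
    cmd = ["tesseract", image, "stdout", "--psm", str(psm), "-l", lang]
    if tessdata_dir is not None:
        # Point to a directory with float models, e.g. a checkout of tessdata_best (assumed path).
        cmd += ["--tessdata-dir", tessdata_dir]
    return cmd


def run_tesseract(image, **kwargs):
    """Run Tesseract on one line image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_cmd(image, **kwargs),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `run_tesseract("line.png", tessdata_dir="/path/to/tessdata_best")` on the same line image that went into `lstmeval` should make the two results easier to compare.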
@jhartungBE, like I said in my [first comment](https://github.com/tesseract-ocr/tesstrain/issues/110#issuecomment-856294912), the Tesseract standalone CLI has much more than just the bare recognition of lstmeval – and that includes a check and compensation...
Yes, that's what it means. Just install ImageMagick and do a `convert input.png -negate output.png`
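To illustrate what the negation does to the pixel values, here is a tiny sketch in plain Python. It assumes an 8-bit grayscale image represented as a nested list of integers; the function name `negate` is made up for this example and is not part of ImageMagick or Tesseract.

```python
def negate(pixels, max_value=255):
    """Invert a grayscale image given as rows of integer pixel values."""
    # Each value is replaced by its complement, so black (0) becomes white (max_value).
    return [[max_value - value for value in row] for row in pixels]


# White text on a black background becomes black text on a white background.
light_on_dark = [[0, 255, 0], [0, 255, 0]]
dark_on_light = negate(light_on_dark)
```

For real images, the ImageMagick command above is the simpler route.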