tesserocr
Using PSM.AUTO_OSD or default doesn't make any difference
Hi,
I noticed that the text extracted from an image is the same regardless of whether I use PSM.AUTO_OSD or the default (PSM.AUTO according to the code).
Weirder yet, AUTO_OSD (which is OCR + OSD) takes about ~~half the time~~ the same time as the default, even though the latter isn't supposed to use OSD.
And even weirder, the default does in fact perform OSD, since I can OCR images rotated by 90/180/270 degrees.
Is it possible that the comments are wrong and the default is AUTO_OSD?
For the record: Tesseract itself is a little weak in documenting this properly. (It happened when transitioning from version 3 to LSTM-based 4.)
- OSD (as in `DetectOrientationScript()` or `DetectOS()`) is a legacy feature (i.e. only available with the old engine still compiled in, and not deactivated via `oem=LSTM_ONLY`). It also requires installing the `osd.traineddata` model (which contains samples from all major scripts for script detection). It is active in `AUTO_OSD` (as well as `OSD_ONLY` and `SPARSE_TEXT_OSD`). When active, it is used during layout analysis. That means its scripts are added to the loaded languages, and its orientation (multiple of 90°) is applied, provided the confidence threshold is met (i.e. the best score is at least `min_orientation_margin` away from the next-best candidate).
- Tesseract >= 4 also has orientation and skew detection independent of that (as part of `AnalyseLayout()` / `FindLines()`, and can be queried via `Orientation()` in the page iterator). This is active in `PSM.AUTO` (as well as `AUTO_OSD`, `AUTO_ONLY` and `SPARSE_TEXT_OSD`). It is also used (after OSD, if any). It does not have confidence thresholds (that I know of). Besides transposing away multiples of 90°, it can also rotate arbitrarily to deskew.
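To make the margin rule from the first point concrete, here is a minimal plain-Python sketch. It only illustrates the rule as described above and is not Tesseract's actual code; the function name `accept_orientation` and the example scores are hypothetical.

```python
def accept_orientation(scores, min_orientation_margin):
    """Return the index of the best-scoring orientation, or None.

    `scores` holds one (hypothetical) score per candidate orientation.
    The best candidate is accepted only if it leads the runner-up by
    at least `min_orientation_margin`.
    """
    # Rank candidate indices from highest to lowest score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if scores[best] - scores[runner_up] >= min_orientation_margin:
        return best
    return None  # margin too weak: leave the page orientation untouched


# Clear winner: candidate 0 leads the runner-up by 9.0.
print(accept_orientation([12.0, 1.0, 3.0, 0.5], 7.0))
# Ambiguous: the lead is only 1.0, so nothing is applied.
print(accept_orientation([5.0, 4.0, 1.0, 0.0], 7.0))
```

With a weak margin, OSD would not change the orientation, which is one way `AUTO` and `AUTO_OSD` can end up producing identical output.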
Thus, there are two different implementations with different APIs and confusing overlap in terminology. See here for a feature comparison.
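As a rough sketch of how the two mechanisms can be queried separately from tesserocr: the helper name `query_orientation` and the shape of the returned dict are my own. The tesserocr calls follow the project's README, but I have not checked them against every tesserocr/Tesseract version.

```python
def query_orientation(image_path):
    """Return what each of the two mechanisms reports for one image."""
    # Imported lazily so this module loads even without tesserocr installed.
    from tesserocr import PyTessBaseAPI, PSM

    report = {}

    # 1) Legacy OSD: needs osd.traineddata and the legacy engine.
    with PyTessBaseAPI(psm=PSM.OSD_ONLY) as api:
        api.SetImageFile(image_path)
        # A falsy result is treated as "OSD could not run or found nothing".
        report["osd"] = api.DetectOS() or None

    # 2) Layout analysis: orientation and deskew angle via the page iterator.
    with PyTessBaseAPI(psm=PSM.AUTO) as api:
        api.SetImageFile(image_path)
        page_it = api.AnalyseLayout()
        if page_it is None:
            report["layout"] = None
        else:
            orientation, direction, order, deskew_angle = page_it.Orientation()
            report["layout"] = {
                "orientation": orientation,
                "deskew_angle": deskew_angle,
            }
    return report
```

Calling `query_orientation("page.png")` (any image path of yours) on a rotated scan should show whether one or both mechanisms notice the rotation.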
Now regarding your questions:
- Yes, you can therefore get the same results (including transposition and even rotation), irrespective of whether OSD is allowed.
- I don't think you would see OSD take a large toll on CPU time. But check that OSD can run properly to begin with (`osd` model + OEM mode).
- Yes, we need to improve documentation on that here.
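To check these points on your side, something like the following may help. It is a sketch under assumptions: the helper names (`osd_model_available`, `run_ocr`, `summarize`, `compare_modes`) are made up here, the timing includes API initialization, and the OEM mode is not checked.

```python
import time


def osd_model_available():
    """True if tesserocr lists an 'osd' model in its tessdata directory."""
    import tesserocr  # lazy import: only needed when actually called
    _path, languages = tesserocr.get_languages()
    return "osd" in languages


def run_ocr(image_path, psm_name):
    """OCR one image with the named PSM; return (text, elapsed seconds)."""
    from tesserocr import PyTessBaseAPI, PSM
    start = time.perf_counter()
    with PyTessBaseAPI(psm=getattr(PSM, psm_name)) as api:
        api.SetImageFile(image_path)
        text = api.GetUTF8Text()
    return text, time.perf_counter() - start


def summarize(runs):
    """`runs` maps a PSM name to (text, seconds)."""
    texts = {text for text, _ in runs.values()}
    return {
        "same_text": len(texts) == 1,
        "seconds": {name: secs for name, (_, secs) in runs.items()},
    }


def compare_modes(image_path):
    """Run the same image through AUTO and AUTO_OSD and compare."""
    runs = {name: run_ocr(image_path, name) for name in ("AUTO", "AUTO_OSD")}
    return summarize(runs)
```

If `osd_model_available()` returns `False`, `AUTO_OSD` cannot really be doing OSD. That alone would explain identical text and similar timings from `compare_modes(...)`.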