Text missing from OCR output for a high-quality 300 dpi image

Open • jyotiyadav94 opened this issue 2 years ago • 3 comments

I have a 300 dpi image that has been converted to grayscale. When I print the result of pytesseract.image_to_string with config="--psm 6", I get the output below:

&- EUROCORPORATION PRODUTTORE D&D Costruzioni SRL CLIENTE D&D Costruzioni SRL

PORTA ROMANA - VIA METASTASIO 48 50124 (FI) Mario Nobile - Mob. 3381558401

FABRIZIO PIPOLO CF Produttore 06725110487

3357566640 Email Cliente [email protected]

23/02/2022 08:00 Email Produttore | [email protected]

OE

| Mattina | Dalle 08:00 Alle 12:00 PER LA LOGISTICA: CHIAMARE PER COMUNICARE ORARIO Florec 3200436592 https://www.google.com/maps/@43.7602578,11.2378078,3a, 75y ,244.78h,76.14t/data=!3m7!1e1!3m5! 1sd6ERXwcpkiHcDdpt_LkqMA!2e0!6shttps:%2F% 2Fstreetviewpixels-pa.googleapis.com%2Fv1%2Fthumbnail% 3Fpanoid%3Dd6ERXwcpkiHcDdpt_LkqMA%26cb_client% 3Dmaps-_sv.tactile.gps%26w%3D203%26h%3D100%26yaw% 3D322.34424%26pitch%3D0%26thumbfov%3D100!7i116384! 818192

170904.1 Calcinacci puliti Solido non pulverulento El EUROCORPORATION EUROCORPORATION S.r.I. C.F. - P.iva 05235640488 R.E.A. 531452 Capitale Sociale € 100.000,000 Via de’ Cattani 178 - 50145 Firenze | Tel. 055 7222419 Fax 055 7227520 [email protected] | [email protected] | www.eurocorporation.it Pag. 1 di 2 23/02/2022


Environment

  • Tesseract Version: 4
  • Commit Number: https://github.com/jyotiyadav94/ImageProcessing/blob/main/pytesseract_layoutlmv3.ipynb
  • Platform: Windows

Current Behavior:

The output is missing some strings that appear in the image, such as Unit. Loc, Operatore, and Data Richiesta.

Expected Behavior:

Tesseract should detect all the strings in the image.

Image - https://github.com/jyotiyadav94/ImageProcessing/blob/main/dataset/11page-0.PNG
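
To make the gap easier to track between runs, a small check like the one below can list which expected labels are absent from the OCR output. This is only a minimal sketch: `normalize` and `find_missing_labels` are hypothetical helpers (not part of pytesseract or Tesseract), the expected labels are the ones named in this report, and the sample string is just the first line of the output pasted above.

```python
import re
from typing import Iterable, List


def normalize(text: str) -> str:
    """Lowercase and collapse runs of whitespace so matching ignores layout."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_missing_labels(ocr_text: str, expected_labels: Iterable[str]) -> List[str]:
    """Return the expected labels that do not occur in the OCR output."""
    haystack = normalize(ocr_text)
    return [label for label in expected_labels if normalize(label) not in haystack]


# Labels reported missing in this issue, plus two that do appear in the output.
expected = ["Unit. Loc", "Operatore", "Data Richiesta", "PRODUTTORE", "CLIENTE"]

# First line of the pasted output; in practice, pass the full string
# returned by pytesseract.image_to_string.
ocr_output = "&- EUROCORPORATION PRODUTTORE D&D Costruzioni SRL CLIENTE D&D Costruzioni SRL"

missing = find_missing_labels(ocr_output, expected)
print(missing)
```

The check is a plain substring match, so a label that Tesseract misreads by even one character is reported as missing too.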

jyotiyadav94 • Jul 14 '22 11:07

We don't support 3rd party tools like pytesseract. We also don't support Tesseract 4.x.

Try again with Tesseract 5.2.0.
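
One way to retest without pytesseract is to call the `tesseract` command-line program directly. The following is a minimal sketch, assuming the `tesseract` binary is installed and on `PATH`; `build_tesseract_command` and `run_tesseract` are illustrative helper names, `--psm 6` mirrors the configuration in the report, and `--dpi 300` simply passes along the resolution stated there.

```python
import subprocess
from typing import List, Optional


def build_tesseract_command(
    image_path: str, psm: int = 6, dpi: int = 300, lang: Optional[str] = None
) -> List[str]:
    """Build the argument list for a tesseract CLI call that prints text to stdout."""
    cmd = ["tesseract", image_path, "stdout", "--psm", str(psm), "--dpi", str(dpi)]
    if lang:
        # Assumption: the matching language data (e.g. "ita") is installed.
        cmd += ["-l", lang]
    return cmd


def run_tesseract(image_path: str, **options) -> str:
    """Run tesseract on the image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if tesseract exits with an error
    )
    return result.stdout


# Example call (not executed here; requires the tesseract binary and the image):
# text = run_tesseract("11page-0.PNG", psm=6, dpi=300)
```

Running `tesseract --version` first confirms which release is actually being invoked.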

amitdo • Jul 15 '22 06:07

@amitdo How can I install Tesseract 5.2.0 in a Dockerfile? Locally, I ran brew install tesseract, which worked.

ajinkyaG28 • Jul 26 '22 20:07

@ajinkyaG28,

Please use our forum for asking questions.

amitdo • Jul 27 '22 03:07