OCR_preprocessing_tool
A simple OCR preprocessing tool using Python with a GUI.
This repo is adapted from https://github.com/insaneyilin/document_scanner; note_shrink.py
is adapted from https://github.com/mzucker/noteshrink.
Usage
- GUI (image rotation, binarization, edge detection, dilation/erosion, automatic/manual document scanner, color inversion, and PDF-to-PNG conversion):
    python OCR_preprocessing_tool.py
- Command line, automatic document scanner:
    python doc_scanner_app.py --image=<input_image_path>
- Command line, text compression and enhancement:
    python note_shrink.py IMAGE <input_image_path>
  To list all available options, run:
    python note_shrink.py -h
Dependencies
- Python 3
- Tkinter
- OpenCV
- Pillow
- NumPy
- SciPy
- pdf2image
Install the packages listed in requirements.txt:
    pip install -r requirements.txt
Demo
Rotation
Binarization
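The README does not say which thresholding method the GUI applies. As an illustration only, the sketch below binarizes a grayscale image with Otsu's method in plain NumPy. The function names `otsu_threshold` and `binarize` are hypothetical and are not taken from this repo.

```python
import numpy as np


def otsu_threshold(gray: np.ndarray) -> int:
    """Return the Otsu threshold (0-255) for a uint8 grayscale image."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)

    weight_bg = np.cumsum(prob)            # share of pixels with value <= t
    cum_mean = np.cumsum(prob * levels)    # cumulative intensity mean up to t
    total_mean = cum_mean[-1]

    # Between-class variance for every candidate threshold t.
    denom = weight_bg * (1.0 - weight_bg)
    with np.errstate(divide="ignore", invalid="ignore"):
        between = (total_mean * weight_bg - cum_mean) ** 2 / denom
    between[denom == 0] = 0.0              # thresholds that leave one class empty
    return int(np.argmax(between))


def binarize(gray: np.ndarray) -> np.ndarray:
    """Map pixels above the Otsu threshold to 255 and the rest to 0."""
    t = otsu_threshold(gray)
    return np.where(gray > t, 255, 0).astype(np.uint8)
```

A fixed or adaptive threshold may suit unevenly lit scans better; Otsu's method is used here only because it needs no manual parameter.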
Edge detection
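The edge detector used by the tool is not documented in this README. As a rough, NumPy-only illustration of the idea, the sketch below marks pixels whose Sobel gradient magnitude exceeds a threshold. The name `sobel_edges` and the default threshold of 100 are assumptions made for this example.

```python
import numpy as np


def sobel_edges(gray: np.ndarray, threshold: float = 100.0) -> np.ndarray:
    """Return a uint8 edge map (255 = edge) from the Sobel gradient magnitude."""
    # Replicate the border so the output has the same shape as the input.
    img = np.pad(gray.astype(np.float64), 1, mode="edge")

    # Horizontal gradient: right column minus left column, weighted 1-2-1.
    gx = (
        (img[:-2, 2:] + 2 * img[1:-1, 2:] + img[2:, 2:])
        - (img[:-2, :-2] + 2 * img[1:-1, :-2] + img[2:, :-2])
    )
    # Vertical gradient: bottom row minus top row, weighted 1-2-1.
    gy = (
        (img[2:, :-2] + 2 * img[2:, 1:-1] + img[2:, 2:])
        - (img[:-2, :-2] + 2 * img[:-2, 1:-1] + img[:-2, 2:])
    )
    magnitude = np.hypot(gx, gy)
    return np.where(magnitude >= threshold, 255, 0).astype(np.uint8)
```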
Erosion
Dilation
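To make the difference between erosion and dilation concrete, here is a minimal NumPy sketch with a fixed 3x3 square kernel on a binary (0/255) image. The kernel size, the border handling, and the function names are assumptions for illustration; the GUI may use different settings.

```python
import numpy as np


def _neighborhood_stack(binary: np.ndarray, pad_value: int) -> np.ndarray:
    """Stack the nine 3x3-shifted views of a 2-D image along a new axis."""
    padded = np.pad(binary, 1, mode="constant", constant_values=pad_value)
    h, w = binary.shape
    return np.stack(
        [padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)]
    )


def dilate(binary: np.ndarray) -> np.ndarray:
    """3x3 dilation: a pixel becomes white if any neighbor is white."""
    # Pad with black so the border adds no white pixels.
    return _neighborhood_stack(binary, 0).max(axis=0)


def erode(binary: np.ndarray) -> np.ndarray:
    """3x3 erosion: a pixel stays white only if all neighbors are white."""
    # Pad with white so the border alone does not erode the image.
    return _neighborhood_stack(binary, 255).min(axis=0)
```

On scanned text, dilation thickens white regions and closes small gaps, while erosion thins them and removes isolated white specks.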
Select corners manually
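Once four corners have been picked, the page can be flattened with a perspective transform. The sketch below shows one way to derive that transform in plain NumPy: it orders the corners and solves for the 3x3 homography. It assumes the corners arrive as (x, y) pixel coordinates in arbitrary order; the function names are hypothetical and the repo's own code may differ.

```python
import numpy as np


def order_corners(pts: np.ndarray) -> np.ndarray:
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left."""
    pts = np.asarray(pts, dtype=np.float64)
    s = pts.sum(axis=1)          # x + y: smallest at top-left, largest at bottom-right
    d = pts[:, 1] - pts[:, 0]    # y - x: smallest at top-right, largest at bottom-left
    return np.array([pts[s.argmin()], pts[d.argmin()], pts[s.argmax()], pts[d.argmax()]])


def homography(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Solve for the 3x3 matrix H that maps each src point onto its dst point."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per point pair, with H[2, 2] fixed to 1.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=np.float64), np.array(rhs, dtype=np.float64))
    return np.append(h, 1.0).reshape(3, 3)


def apply_homography(H: np.ndarray, pts: np.ndarray) -> np.ndarray:
    """Apply H to an array of (x, y) points and return the mapped (x, y) points."""
    pts = np.asarray(pts, dtype=np.float64)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return homog[:, :2] / homog[:, 2:3]
```

With OpenCV installed, `cv2.getPerspectiveTransform` computes the same kind of matrix and `cv2.warpPerspective` applies it to the whole image.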
Auto detection (not very robust)
Text enhancement (after applying perspective transform)
Conversion of PDF to PNG
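Since pdf2image is listed as a dependency, the conversion can be sketched with its `convert_from_path` function, which returns one Pillow image per page and needs the Poppler utilities installed. The helper names, the output naming scheme, and the 200 dpi default below are assumptions for illustration, not the tool's actual behavior.

```python
from pathlib import Path


def page_path(pdf_path: str, page_number: int, out_dir: str = ".") -> Path:
    """Build an output path such as out_dir/report_page_1.png (naming is an assumption)."""
    return Path(out_dir) / f"{Path(pdf_path).stem}_page_{page_number}.png"


def pdf_to_png(pdf_path: str, out_dir: str = ".", dpi: int = 200) -> list:
    """Render every page of a PDF to a PNG file and return the written paths."""
    # Imported lazily so page_path() can be used without pdf2image installed.
    from pdf2image import convert_from_path  # requires the Poppler utilities

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    for number, page in enumerate(convert_from_path(pdf_path, dpi=dpi), start=1):
        target = page_path(pdf_path, number, out_dir)
        page.save(target, "PNG")  # each page is a PIL.Image.Image
        written.append(target)
    return written
```

A call such as `pdf_to_png("scan.pdf", out_dir="pages")` would write one PNG per page into `pages/`; a higher `dpi` gives sharper text for OCR at the cost of larger files.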
References
https://github.com/insaneyilin/document_scanner
https://github.com/mzucker/noteshrink
http://www.pyimagesearch.com/2014/09/01/build-kick-ass-mobile-document-scanner-just-5-minutes/
https://www.geeksforgeeks.org/convert-pdf-to-image-using-python/
https://www.geeksforgeeks.org/how-to-hide-recover-and-delete-tkinter-widgets/
http://vipulsharma20.blogspot.com/2016/01/document-scanner-using-python-opencv.html
https://github.com/lancebeet/imagemicro