Docify
A service for extracting text from Indian ID cards such as the Aadhaar Card, PAN Card, and Driving Licence. Take a picture of the card, send it to the API, and get back JSON with your details...
A deep-learning-based Flask API that extracts details from Indian ID cards such as the Aadhaar Card, PAN Card, and Driving Licence.
Tech
Docify uses a number of open source projects to work properly:
- Tesseract - Tesseract Open Source OCR Engine
- Text-Detection-CTPN - Text detection based mainly on the CTPN model in TensorFlow
- Python3.6 - duh
Installation
Install Linux Dependencies
$ sudo apt install cmake
$ sudo apt install tesseract-ocr
$ sudo apt install mongodb
$ sudo apt install libsm6 libxext6
$ sudo apt install supervisor
$ sudo systemctl start mongodb
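To confirm that MongoDB is accepting connections once the service has started, a quick check along these lines may help. This is a minimal sketch that assumes MongoDB listens on its default port 27017 on localhost; `port_is_open` is a hypothetical helper, not part of Docify.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # 27017 is MongoDB's default port; adjust if your setup differs.
    status = "reachable" if port_is_open("127.0.0.1", 27017) else "not reachable"
    print(f"MongoDB on 127.0.0.1:27017 is {status}")
```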
Download Tesseract Models [ENG+HIN+MAR]
https://github.com/tesseract-ocr/tessdata_best
https://github.com/BigPino67/Tesseract-MICR-OCR
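The English, Hindi, and Marathi models from the first repository could be fetched with a short script like the one below. It is a sketch under assumptions: it assumes the files are served from the repository's `main` branch under Tesseract's usual language codes (`eng`, `hin`, `mar`), and the helper names are hypothetical. The MICR repository is left out because its file layout is not described here.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Assumption: the models are served from the repository's "main" branch.
BASE_URL = "https://github.com/tesseract-ocr/tessdata_best/raw/main"
LANGUAGES = ("eng", "hin", "mar")  # English, Hindi, Marathi


def model_url(lang: str) -> str:
    """Build the download URL for one language's .traineddata file."""
    return f"{BASE_URL}/{lang}.traineddata"


def download_models(dest_dir: str, languages=LANGUAGES) -> list:
    """Download each model into dest_dir, skipping files already present."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for lang in languages:
        target = dest / f"{lang}.traineddata"
        if not target.exists():
            urlretrieve(model_url(lang), str(target))
        saved.append(target)
    return saved


if __name__ == "__main__":
    # Only list the URLs here; call download_models() to fetch the files.
    for lang in LANGUAGES:
        print(model_url(lang))
```

Calling `download_models("tessdata")` would place the three files in a local `tessdata` directory. Tesseract can then be pointed at that directory through the `TESSDATA_PREFIX` environment variable.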
Install Python Dependencies
$ pip3 install opencv-python easydict flask face_recognition gunicorn tensorflow keras pytesseract dlib imutils opencv-contrib-python pymongo PyYAML scikit-image scikit-learn
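After the install finishes, a short script can report which packages are still not importable. The sketch below maps the pip package names above to their import names (for example, `opencv-python` and `opencv-contrib-python` are both imported as `cv2`, and `PyYAML` as `yaml`); `missing_modules` is a hypothetical helper, not part of Docify.

```python
from importlib.util import find_spec

# Import names for the packages in the pip command above; several differ
# from the PyPI package name (e.g. scikit-learn is imported as sklearn).
REQUIRED_MODULES = [
    "cv2", "easydict", "flask", "face_recognition", "gunicorn",
    "tensorflow", "keras", "pytesseract", "dlib", "imutils",
    "pymongo", "yaml", "skimage", "sklearn",
]


def missing_modules(names):
    """Return the subset of names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All Python dependencies are importable.")
```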
Start Python API
$ python3 server.py
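With the server running, a client might send a card image and read back the JSON roughly as follows. This is only a sketch: the port (Flask's default, 5000), the `/extract` route, and the `image` form field are all assumptions made for illustration, so check `server.py` for the actual values. It uses only the standard library.

```python
import json
import uuid
from urllib.request import Request, urlopen

# Assumptions: Flask's default port 5000, a hypothetical "/extract" route,
# and a hypothetical form field named "image". Check server.py for the real ones.
API_URL = "http://localhost:5000/extract"


def build_multipart(field: str, filename: str, data: bytes):
    """Encode one file as multipart/form-data; return (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + data + tail, f"multipart/form-data; boundary={boundary}"


def extract_details(image_path: str, url: str = API_URL) -> dict:
    """POST the card image and return the decoded JSON response."""
    with open(image_path, "rb") as fh:
        body, content_type = build_multipart("image", "card.jpg", fh.read())
    # Supplying data makes urllib issue a POST request.
    request = Request(url, data=body, headers={"Content-Type": content_type})
    with urlopen(request, timeout=30) as response:
        return json.load(response)
```

A call such as `extract_details("card_front.jpg")`, where the file name is a placeholder, would return the parsed response as a dictionary.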