Tc_ID_Card_OCR
Automatic information extraction from identity cards with OCR.
Usage
Arguments
- --folder_name: folder path
- --neighbor_box_distance: nearest box distance
- --face_recognition: face recognition method (dlib, ssd, haar)
- --rotation_interval: ID card rotation interval in degrees
- --ocr_method: OCR method (EasyOcr or TesseractOcr)
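The arguments above could be declared with `argparse` roughly as follows. This is a minimal sketch, not the project's actual `main.py`: the function name `build_parser`, the argument types, and every default value are assumptions made for illustration; only the flag names and the listed choices come from this README.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(
        description="Extract information from identity card images with OCR"
    )
    # All defaults below are assumptions, not values confirmed by the project.
    parser.add_argument("--folder_name", default="images",
                        help="folder path")
    parser.add_argument("--neighbor_box_distance", type=float, default=60,
                        help="nearest box distance")
    parser.add_argument("--face_recognition", choices=["dlib", "ssd", "haar"],
                        default="ssd", help="face recognition method")
    parser.add_argument("--rotation_interval", type=int, default=60,
                        help="ID card rotation interval in degrees")
    parser.add_argument("--ocr_method", choices=["EasyOcr", "TesseractOcr"],
                        default="EasyOcr", help="OCR method")
    return parser


# Parse an explicit argument list so the sketch runs without a command line.
args = build_parser().parse_args([
    "--folder_name", "images",
    "--neighbor_box_distance", "60",
    "--face_recognition", "ssd",
    "--ocr_method", "EasyOcr",
    "--rotation_interval", "60",
])
print(vars(args))
```

Restricting `--face_recognition` and `--ocr_method` with `choices` makes a misspelled method name fail at startup instead of partway through processing a folder.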
With the Dlib and Haar face detection models, it is better to choose a rotation angle of less than 30 degrees; otherwise no face may be detected due to image inversion.
Clone the repository, then create a folder and put the ID card images in that folder:
git clone git@github.com:musimab/Tc_ID_Card_OCR.git
mkdir images
python3 main.py --folder_name "images" --neighbor_box_distance 60 --face_recognition ssd --ocr_method EasyOcr --rotation_interval 60
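To illustrate what a rotation interval means in practice, the sketch below lists the angles that would be tried if the card were rotated in fixed steps over a full turn. How the project actually applies `--rotation_interval` is not described in this README, so the helper `candidate_angles` and the full-turn sweep are assumptions.

```python
from typing import List


def candidate_angles(rotation_interval: int) -> List[int]:
    """Return the angles (in degrees) visited when stepping through a full turn.

    Hypothetical helper: assumes the sweep starts at 0 and stops before 360.
    """
    if not 0 < rotation_interval <= 360:
        raise ValueError("rotation_interval must be in the range 1..360")
    return list(range(0, 360, rotation_interval))


print(candidate_angles(60))
print(len(candidate_angles(30)))
```

Under this assumption, a smaller interval means more candidate orientations to run the face detector on, at the cost of more detector calls per image.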
Create a Python 3 virtual environment and install the dependencies:
python3 -m venv card_id_ocr_venv
source card_id_ocr_venv/bin/activate
pip3 install -r requirements.txt
The result image and cropped regions will be saved to ./outputs by default.
The JSON data will be saved to ./test by default.
To find all virtual environments on Linux: `locate -b '\activate' | grep "/home"`
TODOs
- A deep-learning-based (YOLO, SSD, Faster R-CNN) identity card recognition model will be developed
Algorithm Pipeline

Input image

Warped image

CRAFT Character Density Map

Unet Output for character density map

CRAFT Output (red boxes) and Matched Boxes (blue boxes)

Ocr Output
- Tc: 12345678909
- Surname: MUSTAFA ALİ
- Name: YILMAZ
- DateofBirth: 07071999
Ocr Evaluation
The accuracy of the optical character recognition system was evaluated according to two criteria: accuracy at the word level and accuracy at the character level.
The evaluate.py script reads the predicted and actual values in JSON format.
Character Level Comparison
- tc: 1303 / 1327 => 98.19 %
- surname: 805 / 816 => 98.65 %
- name: 742 / 746 => 99.46 %
- dateofbirth: 976 / 976 => 100.0 %
Word Level Comparison
- tc: 96 %
- surname: 91 %
- name: 95 %
- dateofbirth: 100 %
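The two criteria can be sketched as follows. This is an illustrative sketch and not the contents of evaluate.py: the position-by-position character comparison, the function names, and the lowercase JSON keys (`tc`, `surname`, `name`, `dateofbirth`) are assumptions. Word-level accuracy is taken here to mean that the whole field matches exactly.

```python
from typing import Dict, List, Tuple

# Assumed field keys; the real JSON files may use different names.
FIELDS = ("tc", "surname", "name", "dateofbirth")


def char_matches(predicted: str, actual: str) -> Tuple[int, int]:
    """Count characters that agree position by position, out of len(actual)."""
    matched = sum(p == a for p, a in zip(predicted, actual))
    return matched, len(actual)


def evaluate(predictions: List[Dict[str, str]],
             ground_truth: List[Dict[str, str]]) -> Dict[str, Dict[str, float]]:
    """Return character-level and word-level accuracy for each field."""
    report = {}
    for field in FIELDS:
        matched = total = exact = 0
        for pred, truth in zip(predictions, ground_truth):
            predicted_value = pred.get(field, "")
            m, t = char_matches(predicted_value, truth[field])
            matched += m
            total += t
            exact += predicted_value == truth[field]  # whole field must match
        report[field] = {
            "char_accuracy": matched / total if total else 0.0,
            "word_accuracy": exact / len(ground_truth) if ground_truth else 0.0,
        }
    return report


# Made-up sample records, for illustration only.
truth = [
    {"tc": "12345678909", "surname": "YILMAZ", "name": "ALI", "dateofbirth": "07071999"},
    {"tc": "10987654321", "surname": "KAYA", "name": "AYSE", "dateofbirth": "01011990"},
]
preds = [
    {"tc": "12345678909", "surname": "YILMAZ", "name": "ALI", "dateofbirth": "07071999"},
    {"tc": "10987654327", "surname": "KAYA", "name": "AYSE", "dateofbirth": "01011990"},
]
report = evaluate(preds, truth)
print(report["tc"])
```

A single wrong character lowers the character-level score only slightly but counts the whole field as wrong at the word level, which is why the word-level figures are lower than the character-level ones.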
Easy Ocr
https://github.com/sarra831/EasyOCR