dip-project-business-card-recognition
Digital Image Processing Final Project - Business Card Recognition
- André Moreira Souza - N°USP: 9778985
- Josué Grâce Kabongo Kalala - N°USP: 9770382
Abstract
This project aims to build a digital business card wallet as a mobile application. The application takes a picture of a business card, automatically extracts the information on the card, and saves it in the user's digital wallet. The process uses digital image processing techniques, such as (1) image denoising / deblurring and (2) image segmentation, and natural language processing techniques to categorize the extracted text (name, company, job position, email, location, ...). The application should be able to process and extract information from images taken from different perspectives and possibly containing undesired objects, as in the examples below.
Data Sources
For this project, we will build a collection of images for testing purposes. Initially, the collected images are photos of business cards from companies in São Paulo, taken with cellphones. They will be used to test each step of the project.
Expected development stages
[x] - Corner detection - Detect/approximate corners of the business card of the input image. The user should be able to make adjustments when the detection is not accurate.
[x] - Perspective transform - Some images may be taken from different perspectives. This stage's objective is to normalize the perspective, using the corner points found in the corner detection step, so that the later stages work on a front-facing view of the card.
[x] - Text and character recognition - Recognize the characters present in the text of the business card, and build strings with the sequences of recognized characters.
[ ] - Text categorization for the Portuguese language - For each sentence, use NLP methods to assign it to one of the following categories: entity, phone number, email, location.
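The perspective transform stage can be illustrated with a minimal sketch, assuming NumPy is available. It solves for the 3x3 homography that maps the four detected corners onto an axis-aligned rectangle. The function names, the corner coordinates, and the 400x240 target size are hypothetical and chosen only for illustration; the project's own implementation may differ.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps each src corner onto its dst corner.

    src and dst are sequences of four (x, y) points in matching order.
    The bottom-right entry of H is fixed to 1, leaving 8 unknowns.
    """
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the unknowns.
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
    A = np.array(rows, dtype=float)
    b = np.array(dst, dtype=float).reshape(-1)  # [u0, v0, u1, v1, ...]
    h = np.linalg.solve(A, b)
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map a single (x, y) point through H using homogeneous coordinates."""
    x, y = point
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w

# Hypothetical corners of a card photographed at an angle (clockwise from top-left).
corners = [(12, 30), (410, 18), (430, 260), (5, 240)]
# Target rectangle for the normalized card (assumed size).
target = [(0, 0), (400, 0), (400, 240), (0, 240)]

H = homography_from_corners(corners, target)
print(apply_homography(H, corners[2]))
```

Resampling the whole image through the matrix is usually left to a library routine such as OpenCV's `cv2.warpPerspective`.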
Extra development stages
[ ] - Logo detection
[ ] - Web scraping for more info about the extracted entities
[ ] - Text categorization for the English language
Examples of input images and outputs
Example 1
Expected extracted data for example 1
String | Category |
---|---|
HOLBORN & MOORGATE ENGINEERING | Entity |
+1 801 566-1800 | Phone |
+1 801 566-1801 | Phone |
hme.com | URL |
2011 S 1100 E | Unknown (coordinates) |
Salt Lake City, UT 84106 | Location |
Example 2
Expected extracted data for example 2
String | Category |
---|---|
lush | Entity |
LAWN + PROPERTY ENHANCEMENT | Entity |
JON | Entity |
248-343-5976 | Phone |
KAYLEN | Entity |
734-552-8728 | Phone |
6811 | Number |
CLINTOVILLE ROAD | Location |
CLARKSTON, MICHIGAN 48348 | Location |
WWW.LUSHMICHIGAN.COM | URL |
Example 3
Expected extracted data for example 3
String | Category |
---|---|
Daniel Whitton | Entity |
Cell: | Entity |
817-228-6401 | Phone |
Email: | Entity |
[email protected] | |
Website: | Entity |
www.DFWCarpentry.com | URL |
Address: | Entity |
18011 Bruno Road | Location |
Justin, TX 76247 | Location |
Final notes
As the input and output examples show, the final application must be able to identify the text and categorize each sentence found. The user must be able to edit the recognized strings and categories when the results are not accurate.
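Some of the categories in the tables above follow fixed patterns and could be recognized before any NLP model is involved. The sketch below is a rule-based baseline and not the project's method: the regular expressions and the `categorize` function are assumptions made for illustration, and anything they do not match is returned as `Unknown`, since telling entities from locations is the part that needs NLP.

```python
import re

# Illustrative patterns only; real cards will need broader rules.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
URL_RE = re.compile(
    r"^(https?://)?(www\.)?[a-z0-9-]+(\.[a-z0-9-]+)*\.[a-z]{2,}(/\S*)?$",
    re.IGNORECASE,
)
PHONE_RE = re.compile(r"^\+?[\d\s().-]{7,}$")
NUMBER_RE = re.compile(r"^\d+$")

def categorize(text):
    """Return a category label for one recognized string."""
    s = text.strip()
    if EMAIL_RE.match(s):
        return "Email"
    if URL_RE.match(s):
        return "URL"
    # Require at least 7 digits so short numbers are not taken for phones.
    if PHONE_RE.match(s) and sum(ch.isdigit() for ch in s) >= 7:
        return "Phone"
    if NUMBER_RE.match(s):
        return "Number"
    return "Unknown"  # entity vs. location is left to the NLP stage

for line in ["+1 801 566-1800", "hme.com", "6811", "Salt Lake City, UT 84106"]:
    print(line, "->", categorize(line))
```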
Partial Report
As of this commit (29/05/2019), we have built the image collection for testing and implemented corner detection methods using the Harris Corner Detector and the Shi-Tomasi Corner Detector.
For corner detection, we wrote functions for RGB-to-grayscale conversion, for computing image derivatives with the Sobel operator, and for the corner detection itself. We used OpenCV's functions for denoising and scikit-image's functions for filtering and thresholding.
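The three pieces named above (grayscale conversion, Sobel derivatives, corner response) can be sketched with NumPy alone. This is a simplified illustration and not the project's code: the function names are hypothetical, the structure tensor is summed over a plain 3x3 box window where a Gaussian window is more common, and `k=0.05` is an assumed value for the Harris constant.

```python
import numpy as np

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def to_grayscale(rgb):
    """Convert an (H, W, 3) RGB array to luminance using common weights."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])

def filter3x3(img, kernel):
    """Correlate a 2D image with a 3x3 kernel, replicating the border pixels."""
    padded = np.pad(img, 1, mode="edge")
    h, w = img.shape
    out = np.zeros((h, w), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

def harris_response(gray, k=0.05):
    """Harris response R = det(M) - k * trace(M)^2 for every pixel."""
    ix = filter3x3(gray, SOBEL_X)  # horizontal derivative
    iy = filter3x3(gray, SOBEL_Y)  # vertical derivative
    box = np.ones((3, 3))
    # Entries of the structure tensor M, summed over a 3x3 neighbourhood.
    sxx = filter3x3(ix * ix, box)
    syy = filter3x3(iy * iy, box)
    sxy = filter3x3(ix * iy, box)
    det = sxx * syy - sxy ** 2
    trace = sxx + syy
    return det - k * trace ** 2

# Synthetic test image: a white square on a black background.
image = np.zeros((20, 20, 3))
image[5:15, 5:15] = 1.0
response = harris_response(to_grayscale(image))
print(np.unravel_index(np.argmax(response), response.shape))
```

The response is positive near the corners of the square, negative along its straight edges, and zero in flat regions, which is what lets a threshold separate corners from edges. The Shi-Tomasi detector uses the same structure tensor but scores each pixel by its smaller eigenvalue.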
Final Report
A demonstration of the project, with a detailed explanation and a discussion of the results, can be seen in this Python notebook.
Program Usage
The entire project can be executed using the shell program "shell.py".
Requirements
The following programs and Python packages must be installed on the system running the shell.py script:
- Python packages:
- External programs:
The shell supports the following commands:
- `exit`: end the program and leave the shell.
- `samples`: show 9 business cards selected at random from the sample dataset.
- `selectcard <number>`: select a business card by giving a number between 1 and 9.
- `run`: run the process that analyzes the selected business card and extracts its information.
- `help`: show the usage help message.
Just run `python3 shell.py`.
Enter these commands:
- `samples`. This will show 9 images randomly selected from the sample dataset.
- `selectcard 4`. This will select the fourth card.
- `run`. This will run all the project steps.
- `exit`. This will exit the shell program.
Note that you must close the window showing the image before you can continue using the shell in the terminal.
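For readers who want to see how a command loop like this can be put together, here is a minimal sketch built on Python's standard `cmd` module. It is not the contents of shell.py, which may be organized differently: the `CardShell` class, its attributes, and the file names are hypothetical, and the `run` command only prints a placeholder where the real pipeline would be called.

```python
import cmd
import random

class CardShell(cmd.Cmd):
    """Toy command loop with the same command names as the project's shell."""

    prompt = "(cards) "

    def __init__(self, dataset, **kwargs):
        super().__init__(**kwargs)
        self.dataset = list(dataset)  # hypothetical list of image file names
        self.shown = []               # cards listed by the last 'samples' call
        self.selected = None          # card chosen with 'selectcard'

    def do_samples(self, arg):
        """samples: show 9 business cards selected at random."""
        self.shown = random.sample(self.dataset, min(9, len(self.dataset)))
        for i, name in enumerate(self.shown, start=1):
            self.stdout.write(f"{i}: {name}\n")

    def do_selectcard(self, arg):
        """selectcard <number>: select one of the cards shown by 'samples'."""
        if not arg.strip().isdigit() or not 1 <= int(arg) <= len(self.shown):
            self.stdout.write("Give a number from the list shown by 'samples'.\n")
            return
        self.selected = self.shown[int(arg) - 1]
        self.stdout.write(f"Selected {self.selected}\n")

    def do_run(self, arg):
        """run: process the selected card (placeholder for the real pipeline)."""
        if self.selected is None:
            self.stdout.write("Select a card first.\n")
            return
        self.stdout.write(f"Processing {self.selected} ...\n")

    def do_exit(self, arg):
        """exit: leave the shell."""
        return True  # a true return value stops cmdloop()

shell = CardShell([f"card_{i}.jpg" for i in range(20)])
shell.onecmd("samples")
shell.onecmd("selectcard 4")
shell.onecmd("run")
```

Calling `shell.cmdloop()` would start the interactive prompt; `cmd.Cmd` provides the `help` command automatically from the method docstrings.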