resume_parser
Resume parser
A resume parser used for extracting information from resumes
Built with ❤︎ and ☕ by Kumar Rajwani and Brian Njoroge
Features
- Extract name
- Extract email
- Extract mobile numbers
- Extract skills
- Extract total experience
- Extract college name
- Extract degree
- Extract designation
- Extract company names
Installation
- Install the package with pip:
pip install resume-parser
- Install the spaCy English model and the pinned importlib-metadata dependency:
pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-2.3.1/en_core_web_sm-2.3.1.tar.gz
pip install importlib-metadata==3.2.0
- For NLP operations we use spaCy and NLTK. Install the required models and data using the commands below:
# spaCy
python -m spacy download en_core_web_sm
# nltk
python -m nltk.downloader stopwords
python -m nltk.downloader punkt
python -m nltk.downloader averaged_perceptron_tagger
python -m nltk.downloader universal_tagset
python -m nltk.downloader wordnet
python -m nltk.downloader brown
python -m nltk.downloader maxent_ne_chunker
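The same NLTK resources can also be fetched from within Python, which may be convenient in a setup script. The following is a minimal sketch, assuming `nltk` is already installed; the helper name `download_nltk_resources` is hypothetical and not part of resume-parser.

```python
# Resource names taken from the nltk.downloader commands above.
NLTK_RESOURCES = [
    "stopwords",
    "punkt",
    "averaged_perceptron_tagger",
    "universal_tagset",
    "wordnet",
    "brown",
    "maxent_ne_chunker",
]


def download_nltk_resources(resources=NLTK_RESOURCES):
    """Download each NLTK resource and return the names that failed.

    Hypothetical helper for illustration; not part of resume-parser.
    """
    import nltk  # imported here so the list above is usable without nltk

    failed = []
    for name in resources:
        # nltk.download returns True on success and False otherwise.
        if not nltk.download(name, quiet=True):
            failed.append(name)
    return failed
```

Calling `download_nltk_resources()` once before the first parse should be enough; an empty return value means every resource was fetched.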
Supported File Formats
- PDF, DOCX, and TXT files are supported on all operating systems
Usage
- Import it in your Python project
from resume_parser import resumeparse
data = resumeparse.read_file('/path/to/resume/file')
The first run takes around a minute, so please be patient.
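Since only PDF, DOCX, and TXT files are supported, it may help to check the file extension before parsing. The sketch below is one possible way to do that; `is_supported` and `parse_resume` are hypothetical names introduced here, and only `resumeparse.read_file` comes from the package.

```python
from pathlib import Path

# Formats listed under "Supported File Formats".
SUPPORTED_EXTENSIONS = {".pdf", ".docx", ".txt"}


def is_supported(path):
    """Return True if the file extension is one of the supported formats."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def parse_resume(path):
    """Check the extension, then parse the file (hypothetical wrapper)."""
    if not is_supported(path):
        raise ValueError(f"Unsupported file format: {path}")
    # Imported here so the extension check works without the package loaded.
    from resume_parser import resumeparse

    return resumeparse.read_file(str(path))
```

With this wrapper, an unsupported file such as `resume.odt` raises a `ValueError` immediately instead of reaching the parser.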
Result
The module returns a dictionary with results like the following:
{'degree': ['BSc','MSc'],
'designition': [
'content writer',
'data scientist',
'systems administrator',
],
'email': '[email protected]',
'name': 'Brian Njoroge',
'phone': '+918511593595',
'skills': [
'Python',
' C++',
'Power BI',
'Tensorflow',
'Keras',
'Pytorch',
'Scikit-Learn',
'Pandas',
'NLTK',
'OpenCv',
'Numpy',
'Matplotlib',
'Seaborn',
'Django',
'Linux',
'Docker'],
'total_exp': 3,
'university': ['gujarat university', 'wuhan university', 'egerton university']}
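Extracted values may carry stray whitespace (note `' C++'` in the result), so a small cleanup step can be useful before storing or comparing skills. This is an illustrative sketch only; `clean_skills` is a hypothetical helper, and the sample dictionary is an abbreviated copy of the result shown.

```python
def clean_skills(result):
    """Return skills with whitespace stripped and case-insensitive duplicates removed."""
    seen = set()
    cleaned = []
    for skill in result.get("skills", []):
        name = skill.strip()
        key = name.lower()
        if name and key not in seen:
            seen.add(key)
            cleaned.append(name)
    return cleaned


# Abbreviated sample shaped like the result dictionary.
sample = {
    "name": "Brian Njoroge",
    "skills": ["Python", " C++", "Power BI"],
    "total_exp": 3,
}
print(clean_skills(sample))  # prints ['Python', 'C++', 'Power BI']
```

Using `result.get("skills", [])` keeps the helper safe if a key is missing from the dictionary.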
Custom training
You can use the following notebook to train the spaCy model on your own data. The notebook trains the model to identify degrees in a given text. https://colab.research.google.com/drive/1aSn5tMWU2Lbo4eEPi0GvkBC_003mXxqi?usp=sharing