pyresparser

Significant struggles with name identification

Open BenSturgeon opened this issue 4 years ago • 4 comments

Thank you very much for the work you've done on this.

While the results are currently fairly good, I've noticed that name extraction is a big struggle. I even ran your own resume through the system as a sample, and it returned "www.omkarpathak.in" for the name field.

Do you think adding negative patterns for it to check against is the smartest short-term solution for this problem? Otherwise, do you think the NLP model needs more training on names?
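To make the negative-pattern idea concrete, here is a rough sketch of what I have in mind. It is not pyresparser's code: `NEGATIVE_PATTERNS`, `is_plausible_name`, and `pick_name` are hypothetical names, and the pattern list is only an assumption about what commonly gets mistaken for a name (URLs, emails, digits, bare domains).

```python
import re

# Hypothetical negative patterns: anything matching one of these is
# rejected as a name candidate. The list is illustrative, not exhaustive.
NEGATIVE_PATTERNS = [
    re.compile(r"https?://|www\.", re.IGNORECASE),          # URLs
    re.compile(r"\S+@\S+\.\S+"),                            # email addresses
    re.compile(r"\d"),                                      # digits (phones, dates)
    re.compile(r"\.(com|in|org|net|io)\b", re.IGNORECASE),  # bare domains
]


def is_plausible_name(candidate: str) -> bool:
    """Return False if the candidate is empty or matches a negative pattern."""
    text = candidate.strip()
    if not text:
        return False
    return not any(p.search(text) for p in NEGATIVE_PATTERNS)


def pick_name(candidates):
    """Return the first candidate that survives the negative patterns, else None."""
    for candidate in candidates:
        if is_plausible_name(candidate):
            return candidate.strip()
    return None


if __name__ == "__main__":
    # Candidates as an upstream matcher might produce them, in order.
    print(pick_name(["www.omkarpathak.in", "Omkar Pathak"]))
```

The filter would run on whatever candidates the existing matcher produces, so it only removes obvious false positives and cannot recover a name the matcher never proposed.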

If you need more data, I have access to a large number of CVs that I'd be happy to share.

Thanks again for your continued work on this project.

BenSturgeon • Jun 12 '20 14:06

@BenSturgeon yes. We need a large dataset of resumes to train the model to produce more robust results. If you can share the CVs, it would be really helpful 😄

OmkarPathak • Jun 14 '20 14:06

@OmkarPathak Awesome, I'll send you an email with a Google Drive link containing a large number of CVs.

I'd be happy to contribute by helping with labeling as well if you'd be interested in sharing the process with me.

BenSturgeon • Jun 16 '20 10:06

@BenSturgeon would be happy to share.

OmkarPathak • Jun 16 '20 13:06

Hi, so do we now have a large dataset? It would be great if it were open source.

aditya-malte • Oct 09 '20 08:10