PyRedactKit
Consider using spaCy instead of NLTK for detecting names.
Checklist
- [x] There are no similar reports on existing issues (including closed ones).
- [x] I was in the `master` branch of the latest code.
Is your feature request related to a problem? Please describe
Describe the solution you'd like
The current NLTK-based approach is far too slow when iterating through part-of-speech tagging. Consider using spaCy, whose core loops are written in Cython, to identify names instead. Reference articles:
- https://medium.com/huggingface/100-times-faster-natural-language-processing-in-python-ee32033bdced
- https://www.activestate.com/blog/natural-language-processing-nltk-vs-spacy/
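
To make the request concrete, here is a rough sketch of what name detection via spaCy's named entity recognizer could look like. It is not based on PyRedactKit's actual code: the function names (`load_spacy_pipeline`, `find_person_names`, `redact_names`) and the `[REDACTED]` placeholder are hypothetical, and it assumes spaCy is installed along with the `en_core_web_sm` model, which labels people as `PERSON`.

```python
from typing import Callable, List


def load_spacy_pipeline(model_name: str = "en_core_web_sm"):
    """Load a spaCy pipeline.

    Assumption: spaCy is installed and the model has been downloaded,
    e.g. `python -m spacy download en_core_web_sm`.
    """
    import spacy  # imported lazily so the helpers below do not require spaCy

    return spacy.load(model_name)


def find_person_names(text: str, nlp: Callable) -> List[str]:
    """Return the text of every entity the pipeline labels as PERSON."""
    doc = nlp(text)
    return [ent.text for ent in doc.ents if ent.label_ == "PERSON"]


def redact_names(text: str, nlp: Callable, placeholder: str = "[REDACTED]") -> str:
    """Replace every PERSON entity in `text` with `placeholder`.

    Relies on character offsets (start_char / end_char) that spaCy
    reports for each entity span, in document order.
    """
    doc = nlp(text)
    pieces = []
    last = 0
    for ent in doc.ents:
        if ent.label_ != "PERSON":
            continue
        pieces.append(text[last:ent.start_char])  # text before the name
        pieces.append(placeholder)
        last = ent.end_char
    pieces.append(text[last:])  # remaining text after the last name
    return "".join(pieces)
```

Usage would be along the lines of `nlp = load_spacy_pipeline()` once at startup, then `redact_names(line, nlp)` per input. Loading the pipeline a single time matters, since model loading is much slower than processing a line; for many lines, spaCy's `nlp.pipe` batching may be worth a look as well.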