Email not recognized as an entity if whitespace is present
Describe the bug
The recognize_entities method does not recognize emails that contain whitespace as Email entities. For example
myname@outlook .com is not recognized as an Email, and neither is firstname. [email protected].
To Reproduce
Steps to reproduce the behavior:
document_full_of_emails = '''
This email will be recognized: [email protected]
This one will as well: [email protected]
This email won't be recognized: ourname@outlook. com
Only the last half will be recognized: firstname. [email protected]
'''
Expected behavior
result = client.recognize_entities([document_full_of_emails])[0]
for entity in result.entities:
    print(entity.text)
[email protected]
[email protected]
[email protected]
[email protected]
Actual Behavior
result = client.recognize_entities([document_full_of_emails])[0]
for entity in result.entities:
    print(entity.text)
[email protected]
[email protected]
[email protected].
Platform (please complete the following information):
- Platform: Mac OS Monterey, Python 3.8.3
- Environment: Local and remote on Azure VM, Jupyter Lab, Python script, doesn't seem to matter
- Version of package: latest
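
Possible client-side workaround (a minimal sketch, not a fix for the service): closing whitespace gaps around the dot in the domain part before the text is passed to recognize_entities. The helper name close_domain_gaps and the regular expression are my own assumptions and are not part of the SDK. The sketch only handles gaps after the "@", such as "ourname@outlook. com" or "myname@outlook .com". It deliberately leaves the local part alone ("firstname. lastname@..."), since that case cannot be told apart from an ordinary sentence boundary, and it can still wrongly join text when a lowercase word follows an address that ends a sentence.

```python
import re

# Hypothetical helper, not part of azure-ai-textanalytics.
# Matches "@domain", optional spaces/tabs around a dot, then a lowercase
# suffix such as "com". Requiring a lowercase suffix avoids joining
# "user@host.com. Next sentence" in the common capitalized case.
_DOMAIN_GAP = re.compile(
    r"(@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)*)[ \t]*\.[ \t]*([a-z]{2,})\b"
)


def close_domain_gaps(text: str) -> str:
    """Remove spaces or tabs around the dot in the domain of email-like tokens."""
    return _DOMAIN_GAP.sub(r"\1.\2", text)


if __name__ == "__main__":
    sample = "This email won't be recognized: ourname@outlook. com"
    print(close_domain_gaps(sample))
```

The cleaned string could then be sent in place of the original, for example client.recognize_entities([close_domain_gaps(document_full_of_emails)]), with the caveat that the entity offsets returned would refer to the cleaned text rather than the original document.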