Email not recognized as an entity if whitespace is present

Open americanthinker opened this issue 3 years ago • 0 comments

Describe the bug The recognize_entities method does not recognize an email address as an Email entity if the address contains whitespace. For example, myname@outlook .com is not recognized as an Email, and for firstname. [email protected] only the part after the whitespace is recognized.

To Reproduce Steps to reproduce the behavior:

document_full_of_emails = '''
This email will be recognized: [email protected]
This one will as well: [email protected]
This email won't be recognized: ourname@outlook. com
Only the last half will be recognized: firstname. [email protected]
'''

Expected behavior

results = client.recognize_entities([document_full_of_emails])[0]
for email in results.entities:
    print(email.text)

[email protected]
[email protected]
[email protected]
[email protected]

Actual behavior

results = client.recognize_entities([document_full_of_emails])[0]
for email in results.entities:
    print(email.text)

[email protected]
[email protected]
[email protected].
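
The output above is consistent with a matcher that does not allow whitespace anywhere inside an address. The following is a minimal sketch of that effect, assuming a simplified regular expression; it is not the recognizer that recognize_entities actually uses. The names EMAIL_RE and find_emails and the address lastname@example.com are hypothetical, introduced only for illustration.

```python
import re

# Simplified email pattern (an assumption for illustration only;
# NOT the actual recognizer behind recognize_entities).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def find_emails(text):
    """Return every substring of `text` that matches the simplified pattern."""
    return EMAIL_RE.findall(text)


samples = [
    "myname@outlook.com",               # no whitespace
    "myname@outlook .com",              # space before the dot
    "ourname@outlook. com",             # space after the dot
    "firstname. lastname@example.com",  # space inside the local part
]

for sample in samples:
    print(repr(sample), "->", find_emails(sample))
```

With this pattern the two addresses that have whitespace in the domain produce no match at all, and the address with whitespace in the local part matches only from the text after the space, which mirrors the "only the last half" result reported here.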

Platform (please complete the following information):

  • Platform: Mac OS Monterey, Python 3.8.3
  • Environment: Local and remote on Azure VM, Jupyter Lab, Python script, doesn't seem to matter
  • Version of package: latest

americanthinker, Apr 21 '22 20:04