probablepeople
probablepeople copied to clipboard
:family: a python library for parsing unstructured western names into name components.
Was parsing a list of names for my wedding (super helpful!). And noticed that people with rarer names (first name, last name, etc). are being labeled as corporations (no corporations...
ORIGINAL STRING: Van Blarcom, Kevin P. PARSED TOKENS: [('Van', 'GivenName'), ('Blarcom,', 'Surname'), ('Kevin', 'GivenName'), ('P.', 'MiddleInitial')] UNCERTAIN LABEL: GivenName
```` >>> import probablepeople >>> probablepeople.tag('Andre and Amanda Smith') Traceback (most recent call last): File "", line 1, in File "c:\Python27\lib\site-packages\probablepeople\__init__.py", line 132, in tag raise RepeatedLabelError(raw_string, parse(raw_string), label) probablepeople.RepeatedLabelError:...
Hi, I have a database column called provider_name which can sometimes be a corporation_name like 'Midtown Dental Miami' or sometimes name of a dentist like 'Alek Klebaner DDS' or it...
I'm trying to improve performance of the parser on a fairly messy list containing individuals, households, and corporations. For individuals and households the parser works great. For corporations I see...
http://www.cs.cmu.edu/~einat/datasets.html
I have two middle names, so I put that into the parser, and it recognizes my second middle name as a surname. I understand it is probably more common to...
name count before: 1374 name count after: 762 Method 1. Remove NameCollection 2. Remove CR/LF 3. Add \n after `` 4. sort | uniq 5. Restore NameCollection
pip install probablepeople fails with a superlong sequence of errors. Here are the last lines: ``` /Users/bob/opt/miniconda3/envs/names/include/python3.11/pytypedefs.h:22:16: note: forward declaration of '_frame' typedef struct _frame PyFrameObject; ^ 2 warnings and...