probablepeople
High error rates when surname has St.
I noticed high error rates on a private dataset for names with the following surnames. (I can't list the full names here, but you would probably see a similar effect with almost any given name.) Most were classified as household or corporation.
St John, St Clair, St Clair, St. Peter, St. Marie, St Romaine, St. Jean, St. Mark
This name from a public dataset was identified as a person, but the parser labeled "St." as "and": Rebecca St. James
I think the training dataset needs more examples like these.
Just ran into this for input: SARAH ST. JOHN.
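For anyone who wants to reproduce this, here is a minimal sketch. It assumes `probablepeople.tag` returns a `(tagged_parts, name_type)` pair, with `name_type` being `"Person"`, `"Household"`, or `"Corporation"`. The helper names `check_names` and `run_with_probablepeople` are made up for illustration, and "Sarah" is only a placeholder given name paired with the surnames reported above (the private full names are not available).

```python
from typing import Callable, Iterable, List, Tuple

# Surnames from the report, paired with a placeholder given name (assumption).
SAMPLE_NAMES = [
    "Sarah St John", "Sarah St Clair", "Sarah St. Peter", "Sarah St. Marie",
    "Sarah St Romaine", "Sarah St. Jean", "Sarah St. Mark", "SARAH ST. JOHN",
]


def check_names(names: Iterable[str], tag: Callable[[str], tuple]) -> List[Tuple[str, str]]:
    """Return (name, detected_type) for every name not typed as 'Person'.

    `tag` is any callable with the same return shape as probablepeople.tag:
    a (tagged_parts, name_type) tuple.
    """
    misclassified = []
    for name in names:
        try:
            _, name_type = tag(name)
        except Exception as exc:
            # The tagger can raise when it cannot produce a consistent
            # labeling; record the exception type instead of stopping.
            name_type = f"error: {type(exc).__name__}"
        if name_type != "Person":
            misclassified.append((name, name_type))
    return misclassified


def run_with_probablepeople() -> None:
    """Run the check against the real library (requires probablepeople installed)."""
    import probablepeople  # imported lazily so the helper above has no hard dependency

    for name, name_type in check_names(SAMPLE_NAMES, probablepeople.tag):
        print(f"{name!r} -> {name_type}")
```

Calling `run_with_probablepeople()` prints each sample name that was not typed as a person. Calling `probablepeople.parse(name)` on one of the failures should show which label the "St" / "St." token received.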