probablepeople

High error rates when surname has St.

Open az0 opened this issue 7 years ago • 1 comments

I noticed high error rates on a private data set with the following surnames. (I can't list the full names here, but you might get a similar effect with almost any given name.) Most were classified as household or corporation.

St John
St Clair
St Clair
St. Peter
St. Marie
St Romaine
St. Jean
St. Mark

This name from a public data set was identified as a person, but the parser labeled "St." as "and": Rebecca. St James

I think the training data set needs more examples like these.
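A rough way to reproduce this might look like the sketch below. It pairs each surname from the list with a placeholder given name and tallies the name type the parser reports. The given name "Pat" and the helper names `build_names`, `classify`, and `summarize` are assumptions made for illustration, not part of the library; the only library call used is `probablepeople.tag`, which returns the labeled parts together with a name type.

```python
from collections import Counter

# Surnames taken from the report above.
SURNAMES = [
    "St John", "St Clair", "St. Peter", "St. Marie",
    "St Romaine", "St. Jean", "St. Mark",
]


def build_names(surnames, given="Pat"):
    """Prefix each surname with a given name ("Pat" is a hypothetical placeholder)."""
    return [f"{given} {surname}" for surname in surnames]


def classify(names, tagger=None):
    """Map each name to the type reported by the tagger.

    The tagger must return a (labels, name_type) pair. When none is passed,
    probablepeople.tag is imported lazily and used.
    """
    if tagger is None:
        import probablepeople  # requires the package to be installed
        tagger = probablepeople.tag
    results = {}
    for name in names:
        try:
            _labels, name_type = tagger(name)
        except Exception as exc:  # tag() can raise when it cannot label a string cleanly
            name_type = f"error:{type(exc).__name__}"
        results[name] = name_type
    return results


def summarize(results):
    """Count how many names fell into each reported type."""
    return Counter(results.values())
```

With probablepeople installed, `summarize(classify(build_names(SURNAMES)))` should give a quick count of how many of these names come back as something other than a person.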

az0 avatar Oct 16 '17 04:10 az0

Just ran into this for input: SARAH ST. JOHN.

eggonabull avatar Jan 20 '20 10:01 eggonabull