
N (x%) individuals' last names were not matched.

Open 31YY88 opened this issue 1 year ago • 2 comments

Hi,

First of all, thank you for creating this amazing package! I have a question about the predict_race function. When I use it, I sometimes get a message saying "N (x%) individuals' last names were not matched." Is there a way to identify which last names were not matched?

I also noticed that all observations received some race probability and none were dropped. What are the probabilities that the observations with the unmatched last names receive? Are they based on the distribution of the population or on the probability of the race based on the other variables (gender, first-middle name)?

Thank you for your hard work and for creating such a useful tool!

31YY88, Jul 12 '23 19:07

This is related to #100. I think a good idea would be to add a "return unmatched" parameter and append a boolean flag to the returned data indicating whether each record was matched to the names data. I'm not sure which probabilities are applied here; I'll double-check. In general, though, I think imputing the mean probabilities for the geographic level would be appropriate if it's not already being done, along with a notification to the end user that this is happening.
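
To make the proposal concrete, here is a rough sketch of the flag-plus-imputation idea. It is written in Python rather than R and is not wru code: `predict_with_flag`, `surname_probs`, `geo_mean_probs`, and the `surname_matched` field are all hypothetical names, and the probabilities are made-up toy values. It says nothing about what predict_race currently does with unmatched names.

```python
def predict_with_flag(records, surname_probs, geo_mean_probs):
    """Attach race probabilities and a `surname_matched` flag to each record.

    Hypothetical helper: unmatched surnames fall back to the mean
    probabilities of the record's geography, as proposed in the comment.
    """
    out = []
    for rec in records:
        key = rec["surname"].strip().upper()      # normalise before lookup
        probs = surname_probs.get(key)
        matched = probs is not None
        if not matched:
            probs = geo_mean_probs[rec["geo"]]    # geography-level fallback
        out.append({**rec, **probs, "surname_matched": matched})
    return out


# Toy lookup tables (illustrative values only, not real Census figures).
surname_probs = {"SMITH": {"whi": 0.7, "bla": 0.2, "oth": 0.1}}
geo_mean_probs = {"NJ": {"whi": 0.5, "bla": 0.3, "oth": 0.2}}

records = [
    {"surname": "Smith", "geo": "NJ"},
    {"surname": "Zzyzx", "geo": "NJ"},
]

result = predict_with_flag(records, surname_probs, geo_mean_probs)

# The flag answers the original question: which last names were not matched?
unmatched = [r["surname"] for r in result if not r["surname_matched"]]
pct = 100 * len(unmatched) / len(result)
print(f"{len(unmatched)} ({pct:.1f}%) individuals' last names were not matched.")
print("Unmatched:", unmatched)
```

With a flag like this in the returned data, the end user can filter the unmatched rows directly and see which probabilities were imputed rather than looked up.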

1beb, Oct 10 '23 18:10

@mdblocker Can you look at this one? It should be the same as fixing the problem with missing geos and just returning what's missing.

1beb, Dec 08 '23 20:12