wru
N (x%) individuals' last names were not matched.
Hi,
First of all, thank you for creating this amazing package! I have a question about the predict_race function. When I use it, I sometimes get a message saying "N (x%) individuals' last names were not matched." Is there a way to identify which last names were not matched?
I also noticed that all observations received some race probability and none were dropped. What probabilities do the observations with unmatched last names receive? Are they based on the distribution of the population, or on the probability of each race given the other variables (gender, first and middle name)?
Thank you for your hard work and for creating such a useful tool!
This is related to #100. I think a good idea would be to add a "return unmatched" parameter and append a boolean flag to the returned data indicating whether each row was matched to the names data. I'm not sure which probabilities are applied here; I'll double-check that. Generally, though, I think imputing the mean probabilities for the geographic level would be appropriate if it's not already being done, along with a notification to the end user that this is happening.
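To make the proposal concrete, here is a rough sketch of the flag-plus-imputation idea in plain Python rather than the package's R code. Everything in it is hypothetical: the function name flag_and_impute, the surname_matched flag, the probs field, the uppercase surname normalization, and the fallback to the overall mean when a geography has no matched rows are assumptions for illustration, not wru's current behavior or API.

```python
from collections import defaultdict


def mean_probs(rows):
    """Average a non-empty list of {race: probability} dicts, race by race."""
    n = len(rows)
    return {race: sum(p[race] for p in rows) / n for race in rows[0]}


def flag_and_impute(records, surname_probs, geo_key="county"):
    """Hypothetical sketch: flag unmatched surnames and impute their probabilities.

    records:       list of dicts, each with a "surname" and a geography field.
    surname_probs: dict mapping an UPPERCASE surname to {race: probability}.
    Returns (rows, n_unmatched); each row gains "surname_matched" and "probs".
    """
    by_geo = defaultdict(list)  # matched probabilities grouped by geography
    matched_all = []            # every matched probability dict, for the fallback
    rows = []
    for rec in records:
        probs = surname_probs.get(rec["surname"].strip().upper())
        rows.append(dict(rec, surname_matched=probs is not None, probs=probs))
        if probs is not None:
            by_geo[rec[geo_key]].append(probs)
            matched_all.append(probs)

    # Assumed fallback: overall mean of matched rows, or None if nothing matched.
    overall = mean_probs(matched_all) if matched_all else None
    for row in rows:
        if not row["surname_matched"]:
            geo_rows = by_geo.get(row[geo_key])
            row["probs"] = mean_probs(geo_rows) if geo_rows else overall

    n_unmatched = sum(not row["surname_matched"] for row in rows)
    return rows, n_unmatched
```

With a flag like this, the unmatched names can be listed by filtering on surname_matched, and the count behind the "N (x%)" message is just n_unmatched.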
@mdblocker Can you look at this one? It should be the same as fixing the problem with missing geos and just returning what's missing.