faker
Update first_names_nonbinary with more androgynous names
https://github.com/joke2k/faker/blob/e2d255a942719c756426c995bb350ae5b23f119e/faker/providers/person/en_US/__init__.py#L770-L771
The existing implementation of nonbinary first names in en_US is just a copy of all of the female names and all of the male names. I'd like this list to be a subset of the more androgynous names, in the spirit of the original issue (#1205).
Proposal
Filter the existing dataset using a query like the following, inspired by this project on data.world:
-- These are names with similar numbers of male and female reports for births after 1980.
-- 'Similar' is currently defined as: more than 1/3 of reported babies were in the minority gender for the name.
WITH counts AS (
    SELECT name,
           SUM(number) AS total,
           SUM(number) FILTER (WHERE sex = 'M') AS male,
           SUM(number) FILTER (WHERE sex = 'F') AS female
    FROM babynamesustotal
    WHERE yearofbirth > 1980
    GROUP BY name
)
SELECT name, total, male, female
FROM counts
WHERE total / 3 < male AND total / 3 < female
ORDER BY total DESC
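For anyone who would rather apply the filter in Python than in SQL, here is a rough sketch of the same logic. It assumes the data is available as `(name, sex, number, year_of_birth)` tuples. The function name `androgynous_names` and its `after_year` parameter are hypothetical and are not part of faker or the data.world project.

```python
from collections import defaultdict


def androgynous_names(rows, after_year=1980):
    """Return (name, total, male, female) tuples for names where each sex
    accounts for more than 1/3 of reports, sorted by total descending.

    `rows` is assumed to be an iterable of (name, sex, number, year_of_birth)
    tuples; this layout is an assumption, not a real dataset schema.
    """
    counts = defaultdict(lambda: {"M": 0, "F": 0})
    for name, sex, number, year in rows:
        # Mirror the SQL's `WHERE yearofbirth > 1980`.
        if year > after_year and sex in ("M", "F"):
            counts[name][sex] += number

    result = []
    for name, c in counts.items():
        total = c["M"] + c["F"]
        # 3 * x > total is the same test as total / 3 < x, without division.
        if 3 * c["M"] > total and 3 * c["F"] > total:
            result.append((name, total, c["M"], c["F"]))

    # Mirror the SQL's `ORDER BY total DESC`.
    result.sort(key=lambda r: r[1], reverse=True)
    return result
```

The returned names could then seed a smaller `first_names_nonbinary` list, though how faker would weight or store them is left open here.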
Initial idea: https://github.com/SFDO-Community-Sprints/Snowfakery-Nonprofit/issues/13
cc: @prescod
Sounds like a good idea. Do you have time to prepare a Pull Request?
Not right away, but yup!
This issue is stale because it has been open for 30 days with no activity.
I'm still planning to work on this.
This issue is stale because it has been open for 30 days with no activity.
This issue was closed because it has been inactive for 14 days since being marked as stale.