vaken Race/ethnicity enum doesn't match Census standards

From the Census Bureau:

The U.S. Census Bureau considers race and ethnicity to be two separate and distinct concepts (source)

Race/ethnicity are messy concepts and I'm not arguing the Census' classification is perfect. However, our deviating from their methodology hurts our ability to make comparisons to population-level statistics.

The proper way to do this is to just mimic the Census, but with less fidelity (see below for what the Census does). I don't think we need to worry about a Native American's tribe, the specific Asian country of origin, or the specific Hispanic origin, so we can just do a boolean for Hispanic status and a "check all that apply" for race.
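Here's a rough sketch of the shape I have in mind, in plain TypeScript rather than the actual GraphQL schema. All the names here (`Race`, `Demographics`, `isHispanic`, `races`, `countByRace`) are made up for illustration and aren't in vaken today, and the category list is my assumption of a reasonable coarse version of the Census race groups, not a final proposal.

```typescript
// Hypothetical names throughout; none of these exist in the vaken schema.
// Assumed coarse race categories, loosely modeled on the Census race groups.
enum Race {
  WHITE = "WHITE",
  BLACK_OR_AFRICAN_AMERICAN = "BLACK_OR_AFRICAN_AMERICAN",
  AMERICAN_INDIAN_OR_ALASKA_NATIVE = "AMERICAN_INDIAN_OR_ALASKA_NATIVE",
  ASIAN = "ASIAN",
  NATIVE_HAWAIIAN_OR_PACIFIC_ISLANDER = "NATIVE_HAWAIIAN_OR_PACIFIC_ISLANDER",
  OTHER = "OTHER",
}

interface Demographics {
  // Ethnicity is asked separately from race, as a single yes/no question.
  isHispanic: boolean;
  // "Check all that apply": zero or more races per hacker.
  races: Race[];
}

// Drop duplicate selections so each race counts at most once per hacker.
function normalize(d: Demographics): Demographics {
  return { isHispanic: d.isHispanic, races: Array.from(new Set(d.races)) };
}

// Tally how many hackers selected each race.
// Someone who checks two races is counted once in each group.
function countByRace(all: Demographics[]): Map<Race, number> {
  const counts = new Map<Race, number>();
  for (const d of all) {
    for (const race of normalize(d).races) {
      counts.set(race, (counts.get(race) || 0) + 1);
    }
  }
  return counts;
}
```

Keeping `isHispanic` out of the `Race` enum is the whole point: a Hispanic hacker still picks whichever races apply, so nobody gets forced into a single bucket.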

https://github.com/VandyHacks/vaken/blob/039f3dc77374432aa272559fb24977bf8920ffb5/src/common/schema.graphql.ts#L36

Jul 10 '19 18:07 bencooper222

I don't see the benefit of this. I think the existing implementation is fine, except we should also have an "other" field. Most hackathons only use this data to roughly estimate diversity, which the current categorization is sufficient for imho.

Jul 10 '19 18:07 cktang88

We save like 20 lines of code (we're talking about adding a checkbox and a field) and, in exchange, we get less accuracy with our estimates. More worryingly, our estimates become completely invalid in the formal sense because we can't quantify error at all. We're not running a study or something where that would be completely disqualifying, but accepting less accuracy to avoid basically no extra code seems like a bad choice.

Jul 10 '19 18:07 bencooper222

I don't get what is inaccurate about our current categorization. For example, the demographic stats we got from VH5 seem accurate and perfectly fine.

Jul 10 '19 19:07 cktang88

I can get us some hard numbers on which groups we'd have trouble capturing later. That said, I think we'd struggle to identify bad data if we continue with the approach of past years. How do you identify someone forced to misclassify?

Jul 10 '19 22:07 bencooper222

Well, the "other" option would enable us to identify those who were previously misclassified.

Jul 10 '19 23:07 cktang88

Our numbers aren’t valuable by themselves - they’re valuable by comparison to others. That’s why I think we should stick to the same format that everyone else uses.

Jul 11 '19 01:07 bencooper222