2016-new-coder-survey

List of interesting visualizations

Open SamAI-Software opened this issue 8 years ago • 5 comments

This issue is for project control purposes and will be updated constantly. The latest website preview is here. Please feel free to add interesting visualizations. If you want to participate, you can find the data here and the questionnaire here. The goal is to create D3.js visualizations for all topics from this article and for some facts from this list. If you have any questions about the data, ask them in issue #26. Leave your feedback and ideas about the next survey in issue #39.

The list of interesting visualizations:

Demographics

Socials

  • [ ] MaritalStatus, HasChildren [ #38 ], HasFinancialDependents [ #20 ], FinanciallySupporting
  • [ ] DebtAmount [ #19 ], HasHomeMortgage, HasStudentDebt
  • [ ] HasServedInMilitary [ #16 ]
  • [ ] IsReceiveDiabilitiesBenefits
  • [ ] HasHighSpdInternet

Education & Experience

Current job

  • [ ] EmploymentStatus [ #32 ], EmploymentField [ #12 ]
  • [ ] Income [ #42 ], IsUnderEmployed [ #27 ]
  • [ ] CommuteTime

Future job

  • [ ] JobPref, JobRoleInterest [ #44 ], ExpectedEarning [ #5 #20 #21 #22 #32 ]
  • [ ] JobWherePref [ #22 #32 #37 ], JobRelocate
  • [ ] JobApplyWhen [ #13 #22 ]

SamAI-Software, May 10 '16 12:05

Please drop a comment here when you start working on a new visualization, so that it can be assigned to you and we avoid duplicates.

!Important

As we are all working together on one project, please use the recommended break points for consistency, unless you have a special approach that needs its own break points. If you have any questions, please write a comment in this issue, or ask @evaristoc or @SamAI-Software directly in the FCC Data Science chat room.

The list of recommended groups:

  • Age
    • under 25
    • 25-29
    • 30-39
    • over 39 (40+)
  • Months Programming
    • <1 year (0-11 months)
    • 1-5 years (12-59 months)
    • 5+ years (60+ months)
  • Hours Learning
    • 0-9 hours
    • 10-29 hours
    • 30+ hours

Example:

example
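The recommended groups above can be sketched as a small lookup helper, so that every chart bins values the same way. This is only an illustration: the keys `Age`, `MonthsProgramming` and `HoursLearning` and the function name `group_label` are assumptions, not something agreed in this thread.

```python
from bisect import bisect_right

# For each field: the inclusive lower bounds of every group after the first,
# and the group labels copied from the recommended list.
# The dictionary keys are assumed names, not confirmed by this issue.
BREAK_POINTS = {
    "Age": ([25, 30, 40],
            ["under 25", "25-29", "30-39", "over 39 (40+)"]),
    "MonthsProgramming": ([12, 60],
                          ["<1 year (0-11 months)",
                           "1-5 years (12-59 months)",
                           "5+ years (60+ months)"]),
    "HoursLearning": ([10, 30],
                      ["0-9 hours", "10-29 hours", "30+ hours"]),
}


def group_label(field, value):
    """Return the recommended group label for a numeric value."""
    cuts, labels = BREAK_POINTS[field]
    # bisect_right counts how many lower bounds are <= value,
    # which is exactly the index of the matching group.
    return labels[bisect_right(cuts, value)]


print(group_label("Age", 27))                # 25-29
print(group_label("MonthsProgramming", 11))  # <1 year (0-11 months)
print(group_label("HoursLearning", 30))      # 30+ hours
```

Keeping the break points in one table means a change to the recommendation only has to be made in one place.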

SamAI-Software, May 13 '16 03:05

@SamAI-Software We need to agree on and recommend break points for interval/ratio measurements (e.g. age, time, money). We need some consistency in the way the data will be presented.

evaristoc, May 19 '16 17:05

@evaristoc, very good point! :+1:
I was also thinking about consistency in break points and colors. I'll try to write a draft version asap today or tomorrow. And of course, feel free to suggest your system.

One question about your Podcast viz - why did you use 11+ months rather than 12+ months (1 year and more)?

SamAI-Software, May 20 '16 01:05

On age I've temporarily gone with 0-21, 22-25, 26-29, 30-33, 34+, primarily because it looked nice = 0 statistical validity 😃 codepen.io/krisgesling/pen/GZwYKV Note: Global stats is currently broken while I switch it to change for each tab. The map is also re-plotted every time you select a tab, which is silly and visually jarring, so I'm going to switch to d3 transforms instead.

krisgesling, May 20 '16 01:05

@krisgesling wow, great viz so far! :100:

Consistency in break points is more for bar charts, so don't worry about it, because you have a different approach - to show new visitors the Respondent's Profile by country in a simple and understandable way.

Your visualization would probably be the first one on the page, so it will be the beginning of a story, if @QuincyLarson doesn't mind.

Once users land on the page and see your viz, they should understand straight away:

  • (basic) which countries the respondents live in;
  • (basic) what their gender is, by country;
  • (basic) what the average age is in each country;
  • (special) how many respondents belong to an ethnic minority.

And to do that you can find your own break points that will give the best division, so users will see the difference between countries and understand our story.


Country of living.

I see that you changed groups for the first map - all.

kris_all

And that's great, because now we have a good picture showing that the vast majority of respondents live in the USA and India, but that there are also many in Europe, North America, etc. :+1:


Gender by country.

But now let's look at the gender map.

kris_gender

What can we see here?

  • North Korea, Libya, Mozambique, Lesotho, Belize and Armenia are the top modern countries with a high rate of educated women, where most coders are female, trans or agender.
  • Ethiopia, Niger and Zambia have more female coding learners than the USA.
  • All coders in North Korea are trans, agender & genderqueer.

kris_nk

That's probably not the kind of story we want to show... The reason behind these misleading facts is statistical dispersion / deviation. (@evaristoc or @erictleung can correct me with the proper English term for this.) It's a well-known problem in statistics, so in every experiment researchers set a minimum number of cases (events) inside each group to avoid weird correlations. I'm sure you already know that.

The most famous example is coin tossing. If you toss a coin 10 times, you can get heads 8 times, which would lead you to conclude that the odds are 4:1, while if you toss it 10 000 times you are much more likely to get a result close to 1:1. A few years ago we ran an A/B test with a very small conversion rate, and even 20 000 users didn't give us enough events to see the real result until we put more than 100 000 new users in each group.
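To put a number on the coin example, here is a minimal sketch that computes the exact chance of seeing at least a given number of heads with a fair coin. The function name `prob_at_least_heads` is made up for this illustration, and the fair-coin assumption is mine.

```python
from math import comb


def prob_at_least_heads(heads, tosses):
    """Exact probability of getting `heads` or more heads
    in `tosses` tosses of a fair coin (binomial tail)."""
    # Count the favourable outcomes with integers, then divide once
    # by the total number of equally likely outcomes, 2**tosses.
    favourable = sum(comb(tosses, k) for k in range(heads, tosses + 1))
    return favourable / 2 ** tosses


# 8 or more heads out of 10 is unusual but far from impossible.
print(prob_at_least_heads(8, 10))  # 0.0546875

# The same 80% share of heads over 10 000 tosses is vanishingly unlikely.
print(prob_at_least_heads(8000, 10000))
```

So a small group produces an extreme share by pure chance roughly once in 18 tries, while a large group essentially never does.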

So what to do?

Minimum size of a group.

The solution can be very simple. For example, you can set a minimum number of respondents for a country to be colored on the map; otherwise it will stay white. Good practice is to set the minimum group size to at least 100 people, but we don't have many respondents this year, so you can also set it to 50 or even 20 if you feel that the result will give us a realistic picture.
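A minimal sketch of the minimum-size rule, assuming the per-country counts have already been computed. The function name, the default of 100 and the sample counts are hypothetical and only there to show the idea.

```python
from typing import Optional

MIN_GROUP_SIZE = 100  # could be lowered to 50 or 20, as discussed above


def female_share(female: int, total: int,
                 min_size: int = MIN_GROUP_SIZE) -> Optional[float]:
    """Return the share of female respondents in percent,
    or None when the country has too few respondents to be colored."""
    if total < min_size:
        return None  # the country stays white on the map
    return 100.0 * female / total


# Made-up counts: (female respondents, all respondents).
counts = {"Country A": (26, 100), "Country B": (3, 4)}

for country, (female, total) in counts.items():
    print(country, female_share(female, total))
# Country A 26.0
# Country B None
```

With this rule, "Country B" is not drawn as 75% female from only 4 respondents; it is simply left uncolored.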

Groups.

After setting the minimum size, your previous groups won't give us a nice picture, so you can find a story that you want to tell and show it with a map. Here are some hints that I find interesting:

  • China (25%) & Philippines (26%) have more female coding learners than Canada (22%) & Australia (19%);
  • Russia (16%) & Ukraine (19%) have more female respondents than the UK (15%), Germany (13%) and France (11%);
  • Turkey (8.8%) has fewer female coders than Nigeria (12%), Egypt (11%) & South Africa (11%);
  • Mexico (6.5%) has fewer female coding learners than most Muslim countries like Egypt (11%), Indonesia (11%) & Turkey (8.8%);
  • Most countries of Central and South America have a very low share of female respondents (10% or less);
  • (Note) if you set the minimum size to 20, then be careful with South Korea (38%).

So my suggestions would be something like this:

  • 25+% - USA, China, Philippines, South Korea (if min. size = 20);
  • 20-24% - Canada;
  • 15-19% - Ukraine, Russia, UK, Portugal, Australia, Malaysia...;
  • 10-14% - Germany, France, Finland, Sweden, Spain, Indonesia, Nigeria, Egypt, South Africa, Brazil...;
  • 0-9% - Mexico, Colombia, Venezuela, Vietnam, etc.
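The suggested groups could be turned into map classes with a helper along these lines. This is a rough sketch: `gender_bucket` is a hypothetical name, and treating each group's lower bound as inclusive (so 9.5% still falls into 0-9%) is my assumption.

```python
def gender_bucket(female_percent: float) -> str:
    """Map a female share in percent to one of the suggested groups."""
    if female_percent >= 25:
        return "25+%"
    if female_percent >= 20:
        return "20-24%"
    if female_percent >= 15:
        return "15-19%"
    if female_percent >= 10:
        return "10-14%"
    return "0-9%"


# Percentages quoted in the hints above.
for country, share in [("China", 25), ("Canada", 22), ("Australia", 19),
                       ("Germany", 13), ("Mexico", 6.5)]:
    print(country, gender_bucket(share))
# China 25+%
# Canada 20-24%
# Australia 15-19%
# Germany 10-14%
# Mexico 0-9%
```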


Ok, I think now you have some new ideas for your gender map and also for age & ethnic minority maps! Feel free to choose any path you like and good luck! :)

SamAI-Software, May 20 '16 08:05