Is there a description for the "category" field in the IMDb ratings data for Al Gore’s movie?
I opened Issue 35 in the fivethirtyeight R package repo asking for clarification of the `category` variable in the `ratings` dataset, the one used for "Al Gore's New Movie Exposes The Big Flaw In Online Movie Ratings". @rudeboybert advised me to ask here.
The available categories are as follows:
> library(fivethirtyeight)
> levels(ratings$category)
[1] "Aged 18-29" "Aged 30-44" "Aged 45+" "Aged under 18" "Females"
[6] "Females Aged 18-29" "Females Aged 30-44" "Females Aged 45+" "Females under 18" "IMDb staff"
[11] "IMDb users" "Males" "Males Aged 18-29" "Males Aged 30-44" "Males Aged 45+"
[16] "Males under 18" "Non-US users" "Top 1000 voters" "US users"
However, it is not clear:
- Is `Males under 18` a subset of all `Males`, and if not, how do the categories differ?
- Is there any intersection between the categories?
- If the number of respondents in `'Females Aged 18-29'` + `'Females Aged 30-44'` + `'Females Aged 45+'` + `'Females under 18'` is less than the number of respondents in the `Females` category, is the gap due to respondents with unknown age?
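For reference, the comparison behind the last question can be sketched roughly as below. This is only an illustration: `age_gap` is a hypothetical helper, not part of the package, and it assumes that `ratings` has `timestamp`, `category`, and `respondents` columns and that comparing within a single snapshot (one `timestamp`) is the right unit.

```r
# Hypothetical helper (not part of fivethirtyeight): compares the respondent
# count of a total category with the sum over its age-band categories.
age_gap <- function(snapshot, total = "Females",
                    parts = c("Females under 18", "Females Aged 18-29",
                              "Females Aged 30-44", "Females Aged 45+")) {
  n_total <- sum(snapshot$respondents[snapshot$category == total])
  n_parts <- sum(snapshot$respondents[snapshot$category %in% parts])
  c(total = n_total, parts = n_parts, gap = n_total - n_parts)
}

# Usage against the package data, if it is installed. Assumption: the
# columns named above exist and the latest timestamp is a full snapshot.
if (requireNamespace("fivethirtyeight", quietly = TRUE)) {
  ratings <- fivethirtyeight::ratings
  latest <- ratings[ratings$timestamp == max(ratings$timestamp), ]
  print(age_gap(latest))
}
```

A positive `gap` would be consistent with some respondents having a known gender but no recorded age, though the numbers alone cannot confirm that this is the reason.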