ggplot2-solutions
Exercise 9.3.3 #3 Install EDAWR & tidy storms, population & tb datasets
https://github.com/kangnade/ggplot2-solutions/blob/e6ef9e3b271599f6e97afc0f8a3f012f276f9385/ggplot2_solutions_chapter9.Rmd#L56
The question requires installing the EDAWR package from GitHub.
As indicated, it is installed with devtools::install_github("rstudio/EDAWR"),
which requires the devtools package.
The question says:
Tidy the storms, population and tb datasets.
- storms: EDAWR::storms is a 6 × 4 tibble with one observation per row. I can't see any tidying needed.
- population: it's in long form, with each country occupying multiple rows (one per year). To give each country a single row, with the years spread across columns, use:
library(tidyverse); library(EDAWR)
population %>% pivot_wider(names_from = year, values_from = population)
- tb: the tuberculosis dataset is more complex & requires both gathering the age variable & spreading the years over columns. Rationale: my solution may not be optimal. The tidiest form I could reach has one row for each combination of country name + sex + age, with the years spread across columns. In my opinion, this form makes the dataset easiest to work with, especially via filter() & group_by(). Use:
tb %>% pivot_longer(cols = c(child, adult, elderly), names_to = "age", values_to = "n") %>%
pivot_wider(names_from = year, values_from = n)
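To try the population and tb reshapes above without installing EDAWR, here is a minimal sketch on small made-up tables. The objects pop_toy and tb_toy are hypothetical stand-ins: the values are invented, and the column name country is an assumption based on the description above (the other column names come from the calls themselves). It only illustrates what shape the two pipelines produce and how filter() & group_by() then apply to the result.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Invented stand-in for EDAWR::population (assumed columns, made-up values)
pop_toy <- tibble(
  country    = c("A", "A", "B", "B"),
  year       = c(2011, 2012, 2011, 2012),
  population = c(100, 110, 200, 220)
)

# One row per country, one column per year
pop_wide <- pop_toy %>%
  pivot_wider(names_from = year, values_from = population)

# Invented stand-in for EDAWR::tb (assumed columns, made-up values)
tb_toy <- tibble(
  country = c("A", "A", "A", "A"),
  year    = c(2011, 2011, 2012, 2012),
  sex     = c("female", "male", "female", "male"),
  child   = c(1, 2, 3, 4),
  adult   = c(10, 20, 30, 40),
  elderly = c(5, 6, 7, 8)
)

# Gather the three age columns into age/n, then spread the years over columns:
# the result has one row per country + sex + age combination
tb_wide <- tb_toy %>%
  pivot_longer(cols = c(child, adult, elderly), names_to = "age", values_to = "n") %>%
  pivot_wider(names_from = year, values_from = n)

# The year columns have non-syntactic names, so they need backticks
adult_2012 <- tb_wide %>%
  filter(age == "adult") %>%
  group_by(sex) %>%
  summarise(total_2012 = sum(`2012`))
```

Note that after pivot_wider() the year columns are named 2011 and 2012, so they have to be quoted with backticks in later calls.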
Warmly