Exercise 9.3.3 #3 Install EDAWR & tidy storms, population & tb datasets

Open HossamGhorab opened this issue 3 years ago • 0 comments

https://github.com/kangnade/ggplot2-solutions/blob/e6ef9e3b271599f6e97afc0f8a3f012f276f9385/ggplot2_solutions_chapter9.Rmd#L56

The question requires installing the EDAWR package from GitHub. As indicated, this can be done with devtools::install_github("rstudio/EDAWR"), which needs the devtools package.

The question says:

Tidy the storms, population and tb datasets.

  • storms: EDAWR::storms is a 6 x 4 tibble with one observation per row. I can't see any tidying needed.

  • population: it's in the long form, where each country occupies multiple rows. To tidy it, use: library(tidyverse); library(EDAWR); population %>% pivot_wider(names_from = year, values_from = population)

  • tb: the tuberculosis dataset is more complex and requires both gathering the age columns into one variable and spreading the years over columns. Rationale: my solution may not be optimal. The tidiest form I could reach has one row for each combination of country + sex + age, with the years spread over columns. In my opinion, this form makes the dataset easiest to work with, especially via filter() & group_by(). Use: tb %>% pivot_longer(cols = c(child, adult, elderly), names_to = "age", values_to = "n") %>% pivot_wider(names_from = year, values_from = n)
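
The two-step tb reshape can be tried without installing EDAWR. Below is a minimal sketch on a made-up data frame: tb_toy and its values are hypothetical, and its column names (country, year, sex, child, adult, elderly) are only assumed from the columns the pipeline above refers to, not checked against the real tb dataset.

```r
library(tidyr)

# Hypothetical toy data standing in for EDAWR::tb; the values are invented
# and the column layout is an assumption based on the code in this issue.
tb_toy <- data.frame(
  country = c("A", "A", "A", "A"),
  year    = c(2000, 2000, 2001, 2001),
  sex     = c("female", "male", "female", "male"),
  child   = c(1, 2, 3, 4),
  adult   = c(10, 20, 30, 40),
  elderly = c(5, 6, 7, 8)
)

# Step 1: gather the three age columns into an age/n pair of columns.
tb_long <- pivot_longer(
  tb_toy,
  cols = c(child, adult, elderly),
  names_to = "age",
  values_to = "n"
)

# Step 2: spread the years over columns, leaving one row for each
# combination of country + sex + age.
tb_wide <- pivot_wider(tb_long, names_from = year, values_from = n)

print(tb_wide)
```

With this toy input, step 1 gives 12 rows (4 input rows times 3 age groups) and step 2 gives 6 rows (1 country times 2 sexes times 3 age groups), with one column per year.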

Warmly

HossamGhorab, Oct 02 '21 19:10