gssr
Some NA-type codes not coded as such (but still 98 and 99)
In the current version, some variables contain 98s and 99s that should be NAs, for instance aidswho, sninews, exjobsat, ...
The following code flags some more suspicious ones: variables that contain 98 or 99 but never 97. It obviously also includes a fair few false positives, but it might help with finding the source of this.
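library(dplyr)
library(tidyr)
library(gssr)

# Flag columns that contain 98 or 99 but never 97
suspicious <- gss_all %>%
  summarise(across(
    everything(),
    ~ any(na.omit(.x == 98 | .x == 99)) & !any(na.omit(.x == 97))
  )) %>%
  pivot_longer(everything()) %>%
  filter(value)

# Look up the flagged variables in the documentation table
gss_doc %>%
  filter(id %in% suspicious$name) %>%
  print(n = nrow(.))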
# Output ids:
c("spind80", "tithing", "contemp", "prottemp", "jewtemp", "mslmtemp", "numcong", "wlthwhts", "wlthblks", "wlthhsps", "workwhts", "workblks", "workhsps", "intlwhts", "intlblks", "intlhsps", "yousup", "totmoney", "occfirst", "kdocc80", "minfour", "emailhr", "wwwhr", "emget", "othlang1", "othbest", "uswht", "usblk", "usamind", "yearval", "agerborn", "scinews3", "numorg", "aidswho", "nummen", "cosei10educ", "exjobsat", "leasthrs", "mosthrs", "prfmothr1", "usualhrs")
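Until the source of this is found, a possible interim workaround is to recode 98 and 99 to NA by hand, but only for variables that have been checked against the codebook. The snippet below is a minimal sketch on made-up toy data, not on gss_all itself: `toy` and `to_fix` are hypothetical names, the values are invented, and I have not tested this on the real haven-labelled columns.

```r
library(dplyr)

# Toy stand-in for gss_all; column names come from the list above,
# the values are invented for illustration only.
toy <- tibble(
  aidswho  = c(1, 2, 98, 99, NA),
  exjobsat = c(3, 99, 4, 98, 2),
  age      = c(98, 45, 30, 99, 61)  # legitimate values, must stay untouched
)

# Hypothetical hand-checked list of variables where 98/99 really mean "missing"
to_fix <- c("aidswho", "exjobsat")

# Replace 98 and 99 with NA in the selected columns only
fixed <- toy %>%
  mutate(across(all_of(to_fix), ~ replace(.x, .x %in% c(98, 99), NA)))

print(fixed)
```

Restricting the recode to an explicit list matters because of the false positives mentioned above: for some of the flagged variables 98 and 99 are presumably valid values.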