Some NA-type codes not coded as such (but still 98 and 99)

Open LukasWallrich opened this issue 2 years ago • 0 comments

In the current version, some variables still contain the values 98 and 99 where they should be NA, for instance aidswho, sninews, exjobsat ...

The following code flags some more suspicious variables: those that contain 98 or 99 but never 97. It obviously also includes a fair few false positives, but might help with finding the source of this ...

suspicious <- gss_all %>%
  summarise(across(everything(), ~ (any(na.omit(.x == 98 | .x == 99)) & !any(na.omit(.x == 97))))) %>%
  pivot_longer(everything()) %>%
  filter(value)

gss_doc %>%
  filter(id %in% suspicious$name) %>%
  print(n = nrow(.))

# Output ids:

c("spind80", "tithing", "contemp", "prottemp", "jewtemp", "mslmtemp", "numcong", "wlthwhts", "wlthblks", "wlthhsps", "workwhts", "workblks", "workhsps", "intlwhts", "intlblks", "intlhsps", "yousup", "totmoney", "occfirst", "kdocc80", "minfour", "emailhr", "wwwhr", "emget", "othlang1", "othbest", "uswht", "usblk", "usamind", "yearval", "agerborn", "scinews3", "numorg", "aidswho", "nummen", "cosei10educ", "exjobsat", "leasthrs", "mosthrs", "prfmothr1", "usualhrs")
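
Until the source of this is found, a possible stopgap is to recode 98 and 99 to NA by hand, but only for variables that have been checked individually, since the list above contains false positives. The sketch below is a minimal illustration on an invented toy data frame with plain numeric columns, not on gss_all itself; the `confirmed` vector and the `age` column are assumptions for the example, and labelled columns in the real data may need different handling.

```r
library(dplyr)

# Toy stand-in for gss_all; all values are invented for illustration only.
toy <- tibble(
  aidswho  = c(1, 2, 98, 99, NA),
  exjobsat = c(3, 99, 4, 98, 2),
  age      = c(25, 98, 40, 99, 61)  # hypothetical column where 98/99 are genuine values
)

# Assumption: these variables were checked by hand and use 98/99 as missing codes.
confirmed <- c("aidswho", "exjobsat")

# Recode 98 and 99 to NA in the confirmed columns only; other columns are untouched.
recoded <- toy %>%
  mutate(across(all_of(confirmed), ~ replace(.x, .x %in% c(98, 99), NA)))

print(recoded)
```

Restricting the recode to a hand-checked vector rather than to `suspicious$name` keeps genuine values of 98 or 99 intact in the false positives.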

LukasWallrich, Aug 23 '22 16:08