tidyr
Feature request: a function to count missing values
Dear developers,
The package already has some very useful functions for dealing with missing values.
I wonder whether a function like SAS's nmiss or cmiss, which count missing values, could be added.
I drafted the function below, modeled on dplyr::coalesce, but a more useful and robust implementation from the maintainers would be better.
# Count the number of missing values
library(dplyr, warn.conflicts = FALSE)
cmiss <- function(..., .blanks_to_na = TRUE) {
  args <- rlang::list2(...)
  if (length(args) == 0L) {
    rlang::abort("`...` can't be empty.")
  }
  # Recycle all inputs to a common size
  args <- vctrs::vec_recycle_common(!!!args)
  stopifnot(length(.blanks_to_na) == 1, is.logical(.blanks_to_na))
  if (.blanks_to_na) {
    # Treat empty strings in character inputs as missing
    args <- purrr::map_if(args, is.character, ~ dplyr::na_if(.x, ""))
  }
  # Count the `NA`s at each position across all inputs
  purrr::pmap_int(purrr::map(args, is.na), sum)
}
a <- c(1, 2, NA)
b <- c(3, NA, 4)
c <- "c"
d <- c("NA", "", NA)
# treat "" as `NA` by default
cmiss(a, b, c, d)
#> [1] 0 2 2
cmiss(a, b, c, d, .blanks_to_na = FALSE)
#> [1] 0 1 2
df <- data.frame(v1 = c("a", NA, "b", NA, NA),
                 v2 = c(NA, "c", "d", NA, NA),
                 v3 = c(letters[5:8], NA),
                 v4 = rep(NA, 5))
df %>%
  mutate(n_miss = cmiss(v1, v2, v3, v4),
         first_non_missing = coalesce(v1, v2, v3, v4))
#>     v1   v2   v3 v4 n_miss first_non_missing
#> 1    a <NA>    e NA      2                 a
#> 2 <NA>    c    f NA      2                 c
#> 3    b    d    g NA      1                 b
#> 4 <NA> <NA>    h NA      3                 h
#> 5 <NA> <NA> <NA> NA      4              <NA>
Created on 2024-02-24 with reprex v2.1.0
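For comparison, the same idea can be sketched in base R without any package dependencies. This is only an illustration under assumptions: the name count_na and its blanks_to_na argument are hypothetical and are not existing tidyr or dplyr functions. Unlike the draft above, it adds up the logical vectors directly rather than looping over rows, and it does its own simple length check in place of vctrs recycling.

```r
# Hypothetical helper (not part of tidyr or dplyr): count missing values
# position by position across several vectors, using base R only.
count_na <- function(..., blanks_to_na = TRUE) {
  args <- list(...)
  if (length(args) == 0L) {
    stop("`...` can't be empty.")
  }
  stopifnot(is.logical(blanks_to_na), length(blanks_to_na) == 1L)

  # One logical vector per input: TRUE where the value counts as missing.
  is_missing <- lapply(args, function(x) {
    miss <- is.na(x)
    if (blanks_to_na && is.character(x)) {
      # `%in%` never returns NA, so existing NAs stay flagged as TRUE.
      miss <- miss | x %in% ""
    }
    miss
  })

  # Assumption: inputs must have length 1 or the common maximum length.
  n <- max(lengths(is_missing))
  stopifnot(all(lengths(is_missing) %in% c(1L, n)))
  is_missing <- lapply(is_missing, rep_len, length.out = n)

  # Adding logical vectors element-wise gives the per-position count.
  as.integer(Reduce(`+`, is_missing))
}

a <- c(1, 2, NA)
b <- c(3, NA, 4)
c <- "c"
d <- c("NA", "", NA)
count_na(a, b, c, d)
#> [1] 0 2 2
count_na(a, b, c, d, blanks_to_na = FALSE)
#> [1] 0 1 2

# For columns of a data frame, rowSums() over is.na() gives the same count
# (empty strings are not treated as missing here).
df <- data.frame(v1 = c("a", NA, "b", NA, NA),
                 v2 = c(NA, "c", "d", NA, NA),
                 v3 = c(letters[5:8], NA),
                 v4 = rep(NA, 5))
unname(rowSums(is.na(df)))
#> [1] 2 2 1 3 4
```

The rowSums(is.na(...)) form already covers the data frame case, so the main thing a dedicated function would add is the optional handling of blank strings and a consistent interface alongside coalesce.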
Thank you very much!