tidyr

Feature request: a function to count missing values

Open ynsec37 opened this issue 1 year ago • 0 comments

Dear developer,

The package already has several very useful functions for dealing with missing values. I wonder whether a function like SAS's nmiss or cmiss, which count missing values, could be added. Below is a draft modeled on dplyr::coalesce, but a more general and robust function from your talented developers would be better.

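# Count the number of missing values at each position across the inputs
library(dplyr, warn.conflicts = FALSE)
cmiss <- function(..., .blanks_to_na = TRUE) {
  args <- rlang::list2(...)
  if (length(args) == 0L) {
    rlang::abort("`...` can't be empty.")
  }

  # `.blanks_to_na` must be a single, non-missing TRUE or FALSE
  stopifnot(
    is.logical(.blanks_to_na),
    length(.blanks_to_na) == 1,
    !is.na(.blanks_to_na)
  )

  # Recycle length-1 inputs to the common size
  args <- vctrs::vec_recycle_common(!!!args)

  # Optionally treat empty strings in character inputs as missing
  if (.blanks_to_na) {
    args <- purrr::map_if(args, is.character, ~ dplyr::na_if(.x, ""))
  }

  # Sum the missingness indicators position by position
  purrr::pmap_int(purrr::map(args, is.na), sum)
}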

a <- c(1, 2, NA)
b <- c(3, NA, 4)
c <- "c"
d <- c("NA", "", NA)

# treat "" as `NA` by default
cmiss(a, b, c, d)
#> [1] 0 2 2

cmiss(a, b, c, d, .blanks_to_na = FALSE)
#> [1] 0 1 2

df <- data.frame(v1 = c("a", NA, "b", NA, NA),
  v2 = c(NA, "c", "d", NA, NA),
  v3 = c(letters[5:8], NA),
  v4 = rep(NA, 5))
df %>%
  mutate(n_miss = cmiss(v1, v2, v3, v4),
    first_non_missing = coalesce(v1, v2, v3, v4))
#>     v1   v2   v3 v4 n_miss first_non_missing
#> 1    a <NA>    e NA      2                 a
#> 2 <NA>    c    f NA      2                 c
#> 3    b    d    g NA      1                 b
#> 4 <NA> <NA>    h NA      3                 h
#> 5 <NA> <NA> <NA> NA      4              <NA>

Created on 2024-02-24 with reprex v2.1.0
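
For comparison, here is a rough base R sketch of the same idea, in case a dependency-free version is a useful reference point. The name `cmiss_base` and its `blanks_to_na` argument are hypothetical and not part of tidyr or dplyr, and the sketch assumes every input is a plain vector of length 1 or of one common length.

```r
# Hypothetical helper (not an existing tidyr/dplyr function): count missing
# values position by position across several vectors, using base R only.
cmiss_base <- function(..., blanks_to_na = TRUE) {
  args <- list(...)
  stopifnot(
    length(args) > 0L,
    is.logical(blanks_to_na),
    length(blanks_to_na) == 1L,
    !is.na(blanks_to_na)
  )

  # Recycle length-1 inputs to the common length; reject other mismatches.
  n <- max(lengths(args))
  if (!all(lengths(args) %in% c(1L, n))) {
    stop("All inputs must have length 1 or ", n, ".")
  }
  args <- lapply(args, rep_len, length.out = n)

  # One logical vector per input: TRUE where the value counts as missing.
  is_missing <- lapply(args, function(x) {
    miss <- is.na(x)
    if (blanks_to_na && is.character(x)) {
      miss <- miss | x == ""
    }
    miss
  })

  # Add the indicators position by position.
  as.integer(Reduce(`+`, is_missing))
}

x1 <- c(1, 2, NA)
x2 <- c(3, NA, 4)
x3 <- "c"
x4 <- c("NA", "", NA)

cmiss_base(x1, x2, x3, x4)
#> [1] 0 2 2

cmiss_base(x1, x2, x3, x4, blanks_to_na = FALSE)
#> [1] 0 1 2
```

When the inputs are already columns of one data frame and blanks do not need special handling, `rowSums(is.na(df))` gives the same per-row count without a helper.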

Thank you very much!

ynsec37, Feb 24 '24 11:02