recode_shadow() special missings are not accounted for by summary functions
Hi! Thanks so much for the package; it's such an important tool that the R ecosystem really needed!
I've been using recode_shadow() to handle some special missings. It updates the shadow columns and their factor levels as expected, but functions that summarize missingness in the data frame, e.g. add_any_miss() or miss_var_table(), don't recognize the recoded special missings as missing. In the reprex below, the first row is marked as complete even though the -99 value for wind is a special missing. It might be nice to have an option controlling whether NA aggregations distinguish between "true" (plain) NA and special NAs; without one, I think this omission could easily mislead someone about the completeness of their data.
``` r
library(naniar)
df <- tibble::tribble(
  ~wind, ~temp,
  -99,   45,
  68,    NA,
  72,    25
)
df
#> # A tibble: 3 × 2
#>    wind  temp
#>   <dbl> <dbl>
#> 1   -99    45
#> 2    68    NA
#> 3    72    25
df_recode <- df |>
  bind_shadow() |>
  recode_shadow(wind = .where(wind == -99 ~ "broken_machine"))
df_recode |> add_any_miss()
#> # A tibble: 3 × 5
#>    wind  temp wind_NA           temp_NA any_miss_all
#>   <dbl> <dbl> <fct>             <fct>   <chr>
#> 1   -99    45 NA_broken_machine !NA     complete
#> 2    68    NA !NA               NA      missing
#> 3    72    25 !NA               !NA     complete
df_recode |> miss_var_table()
#> # A tibble: 2 × 3
#>   n_miss_in_var n_vars pct_vars
#>           <int>  <int>    <dbl>
#> 1             0      3       75
#> 2             1      1       25
```

Created on 2024-02-02 with reprex v2.1.0
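
In case it helps anyone hitting the same thing, here is a minimal workaround sketch that derives the row-level flag from the shadow columns directly, treating any shadow value other than "!NA" as missing. It assumes the shadow columns are exactly the ones whose names end in `_NA` (as bind_shadow() produced above). The column name `any_miss_special` is my own invention, not part of naniar.

``` r
library(naniar)

df <- tibble::tribble(
  ~wind, ~temp,
  -99,   45,
  68,    NA,
  72,    25
)

df_recode <- df |>
  bind_shadow() |>
  recode_shadow(wind = .where(wind == -99 ~ "broken_machine"))

# Assumption: shadow columns are identified by the "_NA" suffix.
shadow_cols <- grep("_NA$", names(df_recode), value = TRUE)
shadow <- as.data.frame(df_recode)[shadow_cols]

# TRUE wherever the shadow value is anything other than "!NA",
# so both plain NA and special missings (e.g. NA_broken_machine) count.
is_miss <- as.data.frame(
  lapply(shadow, function(x) as.character(x) != "!NA")
)

# Hypothetical column name; mirrors the labels add_any_miss() uses.
df_recode$any_miss_special <- ifelse(rowSums(is_miss) > 0, "missing", "complete")

df_recode
```

On the three-row example this should flag rows 1 and 2 as missing. `colSums(is_miss)` gives the matching per-variable counts, which could stand in for miss_var_table() until special missings are supported there.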
Hello!
Thank you very much for the kind words :)
I'm glad to hear that you are using the special missings feature, and you make a great point: there should be some way to account for them in the missingness summaries.
When I'm next able to get some time to do a sprint on naniar and visdat, I will revisit this and touch base. Hopefully that will be sooner (0-3 months) rather than later!