Add support for infinity values from sas7bdat files
The recommendation within SAS is to store infinity and negative infinity as the special missing values .I and .M respectively:
https://documentation.sas.com/doc/en/imlug/15.2/imlug_r_sect019.htm
Currently, when using haven::read_sas(), everything just gets read in as NA, with no distinction between the different missing values.
data outlib.test_special_values;
    input
        my_int
        my_float
    ;
    if _N_ = 1 then do;
        my_int = .; /* Missing */
        my_float = .; /* Missing */
    end;
    else if _N_ = 2 then do;
        my_int = .I; /* Infinity */
        my_float = .I; /* Infinity */
    end;
    else if _N_ = 3 then do;
        my_int = .N; /* Not a Number */
        my_float = .N; /* Not a Number */
    end;
    else if _N_ = 4 then do;
        my_int = .M; /* Minus Infinity */
        my_float = .M; /* Minus Infinity */
    end;
    datalines;
. .
. .
. .
. .
;
run;
> haven::read_sas("data/test_special_values.sas7bdat")
# A tibble: 4 × 2
  my_int my_float
   <dbl>    <dbl>
1     NA       NA
2     NA       NA
3     NA       NA
4     NA       NA
That being said, I wasn't sure whether this needed to be raised upstream with readstat first. I wasn't able to see any obvious functions or metadata that would disambiguate between the different missing values.
Hi @gowerc,
You can access the underlying tagged NA values using haven's tagged NA functions (tagged_na(), na_tag(), is_tagged_na()).
To check for a particular value you can use, e.g., is_tagged_na(var, "i") (note that all tagged values come in as lower case), so to do the conversion you could do something like:
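library(dplyr)
library(haven)

dat <- read_sas("data/test_special_values.sas7bdat")

# Map SAS special missing values to their R equivalents.
# Tags come in as lower case; `.default` needs dplyr >= 1.1.0.
convert_tagged_na <- function(x) {
  case_when(
    is_tagged_na(x, "i") ~ Inf,   # .I -> Infinity
    is_tagged_na(x, "m") ~ -Inf,  # .M -> Minus Infinity
    is_tagged_na(x, "n") ~ NaN,   # .N -> Not a Number
    .default = x
  )
}

# Apply the conversion to every numeric column
dat |>
  mutate(across(where(is.numeric), convert_tagged_na))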
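
If it helps to check which tags a column actually carries before converting, na_tag() returns the tag of each element. Below is a minimal sketch on a hand-built vector; `x` is a made-up stand-in for a column read from the file, and the base-R conversion at the end is just an alternative to the dplyr version above, not something haven does for you:

```r
library(haven)

# Hypothetical stand-in for a column returned by read_sas():
# a regular value, a plain NA, and the three special missings.
x <- c(1.5, NA, tagged_na("i"), tagged_na("m"), tagged_na("n"))

# Tag of each element; untagged values (including a plain NA) give NA
tags <- na_tag(x)

# Tagged values still count as missing as far as is.na() is concerned
missing_flags <- is.na(x)

# Base-R version of the conversion, without dplyr
converted <- x
converted[is_tagged_na(x, "i")] <- Inf
converted[is_tagged_na(x, "m")] <- -Inf
converted[is_tagged_na(x, "n")] <- NaN
```

Printing `tags` next to `converted` should make it easy to confirm that each special missing value ended up where you expect, and that the plain missing value was left alone.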
Makes sense, thank you!