Add support for infinity values from sas7bdat files

Open gowerc opened this issue 4 months ago • 2 comments

The recommendation within SAS is to store infinity and negative infinity as the special missing values .I and .M, respectively: https://documentation.sas.com/doc/en/imlug/15.2/imlug_r_sect019.htm

Currently, when using haven::read_sas(), everything is read in as NA, with no distinction between the different missing values:

data outlib.test_special_values;
    input
        my_int
        my_float
    ;
    if _N_ = 1 then do;
        my_int = .;         /* Missing */
        my_float = .;       /* Missing */
    end;
    else if _N_ = 2 then do;
        my_int = .I;        /* Infinity */
        my_float = .I;      /* Infinity */
    end;
    else if _N_ = 3 then do;
        my_int = .N;        /* Not a Number */
        my_float = .N;      /* Not a Number */
    end;
    else if _N_ = 4 then do;
        my_int = .M;        /* Minus Infinity */
        my_float = .M;      /* Minus Infinity */
    end;
    datalines;
. .
. .
. .
. .
;
run;
> haven::read_sas("data/test_special_values.sas7bdat")
# A tibble: 4 × 2
  my_int my_float
   <dbl>    <dbl>
1     NA       NA
2     NA       NA
3     NA       NA
4     NA       NA

That being said, I wasn't sure whether this needed to be raised upstream with ReadStat first; I couldn't see any obvious functions or metadata that would disambiguate between the different missing values.

gowerc avatar Aug 05 '25 16:08 gowerc

Hi @gowerc,

You can access the underlying tagged NA values using the tagged_na() family of functions (for example na_tag() and is_tagged_na()).
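As a rough illustration of what "tagged" means here, the sketch below builds a small synthetic vector with tagged_na() instead of reading the SAS file, so the values are a stand-in for a real column and not output from read_sas(). It assumes the tags arrive as lower-case letters.

```r
library(haven)

# Synthetic stand-in for a numeric column read from SAS (an assumption,
# not real read_sas() output): one value, a plain NA, and three tagged NAs.
x <- c(1.5, NA, tagged_na("i"), tagged_na("n"), tagged_na("m"))

# Plain is.na() cannot tell the different missing values apart.
is.na(x)

# na_tag() returns the tag letter, or NA for untagged elements.
tags <- na_tag(x)
tags

# is_tagged_na() with no tag argument flags any tagged NA.
is_tagged_na(x)

# print_tagged_na() shows the tags when printing, e.g. NA(i).
print_tagged_na(x)
```

The printed tibble hides the tags, which is why every row shows up as a bare NA even though the information is still stored in the values.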

To check for a particular value you can use, for example, is_tagged_na(var, "i") (note that all tagged values come in as lower case), so to do the conversion you could do something like:

library(dplyr)
library(haven)

dat <- read_sas("data/test_special_values.sas7bdat")

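# Map the SAS special missing values to their R equivalents.
# Note: the .default argument requires dplyr >= 1.1.0.
convert_tagged_na <- function(x) {
  case_when(
    is_tagged_na(x, "i") ~ Inf,   # .I -> infinity
    is_tagged_na(x, "m") ~ -Inf,  # .M -> minus infinity
    is_tagged_na(x, "n") ~ NaN,   # .N -> not a number
    .default = x                  # everything else, including plain NA
  )
}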

dat |>
  mutate(across(where(is.numeric), convert_tagged_na))
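If dplyr is not available, roughly the same conversion can be sketched with na_tag() and base R indexing. The helper name convert_tagged_na_base() and the synthetic data frame below are hypothetical, used only so the example runs without the SAS file; the lower-case tags follow the note above.

```r
library(haven)

# Hypothetical base R variant of the conversion (not part of haven).
convert_tagged_na_base <- function(x) {
  # Tagged NAs only exist for double vectors; leave other types untouched.
  if (!is.double(x)) {
    return(x)
  }
  tag <- na_tag(x)
  # %in% treats an NA tag as "no match", so untagged values are skipped.
  x[tag %in% "i"] <- Inf
  x[tag %in% "m"] <- -Inf
  x[tag %in% "n"] <- NaN
  x
}

# Synthetic stand-in for the data read from SAS (an assumption).
x <- c(1.5, NA, tagged_na("i"), tagged_na("m"), tagged_na("n"))
dat <- data.frame(my_int = x, my_float = x)

# Apply the helper to every column while keeping the data frame structure.
dat[] <- lapply(dat, convert_tagged_na_base)
res <- dat$my_float
res
```

A plain NA stays NA in both versions; only the three tagged values are rewritten.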

gorcha avatar Nov 23 '25 12:11 gorcha

Makes sense, thank you!

gowerc avatar Nov 24 '25 10:11 gowerc