Importing SAS formats catalogue with negative format values
Passing along an issue reported against the R haven package that appears to be an upstream issue in ReadStat, which haven uses: https://github.com/tidyverse/haven/issues/768.
To summarize, in the attached example (test.zip), if you have a sas7bdat file (test.sas7bdat) with a single numeric variable named x with values -7, 1, and 2 and a SAS format catalog file that defines the format (format.sas7bcat):
    proc format;
        value testf
            -7 = "Missing"
             1 = "Yes"
             2 = "No"
        ;
    run;
The format value -7 = "Missing" gets imported by haven (using ReadStat) as -0.625 = "Missing". It was also noted in the haven issue that the error can be reproduced in pyreadstat, which suggests it may be an upstream issue with ReadStat.
Some additional investigation by me (not in the attached example) suggests a deterministic pattern between the original SAS format values and the values ReadStat imports. The lagged difference of the imported values stays constant over runs whose lengths double (1, 2, 4, ...), and each time the difference changes it decreases by a factor of 4 (e.g., 2.00 -> 0.50 -> 0.125).
SAS format value   Imported value   Lagged difference of imported value
-1                 -4.0000000       N/A
-2                 -2.0000000       2.000000000
-3                 -1.5000000       0.500000000
-4                 -1.0000000       0.500000000
-5                 -0.8750000       0.125000000
-6                 -0.7500000       0.125000000
-7                 -0.6250000       0.125000000
-8                 -0.5000000       0.125000000
...                ...              ...
Interesting. The relevant code is here:
https://github.com/WizardMac/ReadStat/blob/3438f3431911899ba52566180f258f405a53b12e/src/sas/readstat_sas7bcat_read.c#L104-L113
Positive values are encoded as negative double-precision floating-point numbers, but it looks like negative values aren't just sign-flipped positive values.
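One reading that reproduces the table above: suppose negative values are stored as the bitwise complement of their IEEE 754 bits, while positive values are stored with only the sign bit flipped (a common order-preserving encoding). Decoding such a negative value with a plain sign flip would then give exactly the imported numbers. The sketch below is a minimal Python check of that hypothesis. The encoding is an assumption inferred from the table, not taken from the ReadStat source, and all function names are made up for illustration.

```python
import struct

SIGN = 1 << 63          # sign bit of a 64-bit double
MASK = (1 << 64) - 1    # keep results within 64 bits


def to_bits(x: float) -> int:
    """Reinterpret a double as an unsigned 64-bit integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def from_bits(bits: int) -> float:
    """Reinterpret an unsigned 64-bit integer as a double."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def encode_assumed(x: float) -> int:
    """ASSUMED on-disk encoding (hypothetical, inferred from the table):
    non-negative values get the sign bit flipped, negative values get
    every bit flipped."""
    bits = to_bits(x)
    return (~bits & MASK) if bits & SIGN else (bits ^ SIGN)


def sign_flip_decode(stored: int) -> float:
    """Decode by negating the stored double, which is only right for
    values that were encoded with a sign-bit flip."""
    return -from_bits(stored)


def decode_assumed(stored: int) -> float:
    """Inverse of encode_assumed: undo whichever flip was applied."""
    if stored & SIGN:
        return from_bits(stored ^ SIGN)
    return from_bits(~stored & MASK)


if __name__ == "__main__":
    for x in range(-1, -9, -1):
        misread = sign_flip_decode(encode_assumed(float(x)))
        print(x, round(misread, 7), decode_assumed(encode_assumed(float(x))))
```

Rounded to seven decimals, the sign-flip results for -1 through -8 match the imported column of the table. Under the same assumption, integers with magnitude in [2^k, 2^(k+1)) end up spaced 2^(1-2k) apart, which would account for the run lengths of 1, 2, 4 and the factor-of-4 drop in the lagged difference.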
Would be great if you could supply another test file with more negative values that I can inspect.
Attached is an updated example with formats for values -300 to 2, where each format label is just the number as a character string (e.g., "-300").
Hope it helps!
Great! I think this should do the job? https://github.com/WizardMac/ReadStat/commit/974a3fe7d3047098a7d9c4d30a5f317be146479b