haven
Error in read_sas using catalog file
Hi, I am getting the error message "Error: Failed to parse formats.sas7bcat: Invalid file, or file has unsupported features." when importing SAS data with a catalog file. This is the same error as in the closed issue #34. Importing the data without the catalog file works.
I am using the latest haven version (2.5.0) and also tested with the development version on GitHub.
fpath <- "path/to/sas/data/file"
catalog <- "formats.sas7bcat"
sas_data <- haven::read_sas(fpath, catalog_file = catalog)
Error: Failed to parse formats.sas7bcat: Invalid file, or file has unsupported features.
Hi @ValValetl, thanks for the bug report.
Can you please share the catalog file and also some example data if possible? Without the catalog file it's not possible to track down the error.
Hi @gorcha. Unfortunately, this is not possible as it is non-public data. I thought the issue report might still be of interest, as issue #34 was closed a while ago without any resolution.
Even if not the data, are you able to share the catalog file?
I need to check with the owner. I will get back to you later. Thanks for your quick responses!
Sorry for the long delay. Here is the catalog file that produces the error message: sas_catalog_file.zip
No worries at all, thanks!
Was this ever diagnosed? I'm running into the same issue.
Hi @joshuaborn, I haven't had a chance to look at this yet unfortunately but hopefully will over the next few weeks.
There's no guarantee that this is the same issue affecting you. Would you be able to provide an example file that I can test by any chance?
Hi, @gorcha . The particular file I first encountered the issue with was a restricted use file, but I've seen it with at least one other data set since then. I should have some time this weekend to try it out with public use data files, and if I can replicate it, I'll share.
Thanks @joshuaborn, much appreciated!
I neglected to follow up on this back in September, but I was using Haven today and found a good example of this issue with public use data. Attached are four files from the National Survey of Family Growth 2017-2019 public use data. The d2017_2019femresp.sas7bdat and d2017_2019femresp.sas7bcat pair loads just fine using read_sas, but trying to use read_sas with the d2017_2019fempreg.sas7bdat and d2017_2019fempreg.sas7bcat pair leads to an error message of the form:
Error: Failed to parse .../d2017_2019fempreg.sas7bcat: Invalid file, or file has unsupported features.
Using read_sas on just d2017_2019fempreg.sas7bdat without the catalog file works.
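Since reading without the catalog succeeds, one possible stopgap is to attempt the read with the catalog and fall back to the data file alone when parsing fails. The following is only a minimal sketch: read_sas_with_fallback is a hypothetical helper name, not part of haven, and the data returned by the fallback path will not carry value labels.

```r
library(haven)

# Hypothetical helper (not part of haven): try to read the data together with
# its catalog file; if that fails, warn and read the data file on its own.
read_sas_with_fallback <- function(data_file, catalog_file = NULL) {
  tryCatch(
    read_sas(data_file, catalog_file = catalog_file),
    error = function(e) {
      warning(
        "Could not read with the catalog file; reading data without value labels. ",
        "Original error: ", conditionMessage(e),
        call. = FALSE
      )
      # If the data file itself is unreadable, this call raises its own error.
      read_sas(data_file)
    }
  )
}
```

With the files attached above, this would be called as read_sas_with_fallback("d2017_2019fempreg.sas7bdat", "d2017_2019fempreg.sas7bcat").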
I'm using R version 4.2.2 on Windows 11 with Haven version 2.5.1.
The interesting thing about this example is that the pregnancy data table (d2017_2019fempreg) is ultimately derived from the female respondents table (d2017_2019femresp). I tried examining the two catalog files in SAS using PROC CATALOG, but didn't see anything obvious that was present in one but not the other.
As an aside, since these parse errors seem to happen with catalog files more than with regular SAS data files, maybe it would be worth adding to Haven the ability to side-load value labels from a sas7bdat file or even a CSV file? It seems pretty straightforward to load another table and call labelled as needed, and SAS can export its value labels to a regular data table easily with PROC CONTENTS, etc. I would be willing to work on this, since it would save me time in the long run.
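A rough sketch of what such side-loading could look like with the existing haven::labelled() constructor is below. Everything other than labelled() and is.labelled() is an assumption made for illustration: the helper name apply_value_labels, the lookup table layout (columns variable, value, label), and the toy data. In practice the lookup table would come from a file exported from SAS and read with read_sas() or read.csv(). The sketch only handles numeric codes.

```r
library(haven)

# Hypothetical lookup table of value labels; the column names are assumptions.
label_table <- data.frame(
  variable = c("sex", "sex", "region"),
  value    = c(1, 2, 1),
  label    = c("Male", "Female", "North"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: attach value labels to every column of `data` that is
# named in the lookup table. Columns not mentioned are left untouched.
apply_value_labels <- function(data, label_table) {
  for (var in intersect(unique(label_table$variable), names(data))) {
    rows <- label_table[label_table$variable == var, ]
    # labelled() expects a named vector: names are labels, values are codes.
    labels <- stats::setNames(rows$value, rows$label)
    data[[var]] <- labelled(data[[var]], labels = labels)
  }
  data
}

# Toy data standing in for the result of read_sas() without a catalog file.
survey <- data.frame(sex = c(1, 2, 2), region = c(1, 1, 1))
survey <- apply_value_labels(survey, label_table)
```

After this, survey$sex and survey$region are labelled vectors, so functions such as as_factor() can use the labels as they would for data read with a working catalog file.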
Hi @joshuaborn, thanks for the extra example file - there have been a few recent updates in the dev version of ReadStat for catalog file reading that might resolve these issues, I'll check it out.
I suspect this is a little different from the initial problem in this issue (which was specifically a problem with Unix 64-bit file formats), but there are some other bugs that have been fixed that might affect this one.
Hi @joshuaborn, can confirm that the recent ReadStat changes have fixed the issue with this file. They've just released an update over there so these should be in haven shortly!
Hi, @gorcha. Thanks for confirming that! And my apologies for resurrecting the wrong issue thread.
No worries at all!