
Feature Request: Support for SAS CPORT Files

Open billdenney opened this issue 6 years ago • 1 comments

I don't control the file type that I receive, and sometimes I get SAS CPORT files with the .xpt extension suggesting inaccurately that they are XPORT files (ref tidyverse/haven#453). It would be helpful if ReadStat could read CPORT files (or if not read them, at least give an informative error message that they are CPORT to point the user to the solution of resaving as XPORT).

billdenney, Jul 22 '19 12:07

Like several others, I have just lost a day digging into this same situation. I was not able to find any code out there that reads CPORT files (I went through the source code of them all). Further digging shows that while SAS publishes the details needed to parse XPORT files, they do not do so for CPORT files. This post from the Library of Congress also states that CPORT is a proprietary format (as do other old blogs from folks quoting the FDA).

After digging through a CPORT file with a hex editor, I don't think it would be terribly difficult to reverse engineer the format, but I don't currently have the time. I do agree that it would be a nice addition for libraries to detect CPORT files and give a better warning. Unless the NOCOMPRESS option was used, the first 80 bytes of the file contain the text: COMPRESSED COMPRESSED COMPRESSED COMPRESSED COMPRESSED****** Following this, there is a string of bytes that lists the OS and SAS versions used to create the file: LIB CONTROL X64_SR12¼^F SAS9.4

These should be straightforward to detect in code. Ideally, someone who has access to SAS would create a minimal CPORT file and post it for the authors to test against. Sadly, I don't have a copy of SAS.
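To make the idea concrete, here is a minimal Python sketch of the kind of check I have in mind. It is not ReadStat or haven code, and the function name `sniff_sas_transport` is made up for illustration. The XPORT prefix is the `HEADER RECORD*******` text from SAS's published XPORT record layout. The CPORT check rests only on the `COMPRESSED` text I saw in a hex editor, so treat it as an assumption until someone can test against a real file. A CPORT file written with NOCOMPRESS would come back as "unknown" here.

```python
# Hypothetical helper for illustration; not part of ReadStat or haven.

# First record of an XPORT file, per SAS's published transport format.
XPORT_PREFIX = b"HEADER RECORD*******"

# Assumption: text seen in the first 80 bytes of a compressed CPORT file.
CPORT_MARKER = b"COMPRESSED"


def sniff_sas_transport(header: bytes) -> str:
    """Guess the transport type from the first bytes of a file.

    Returns "xport", "cport", or "unknown".
    """
    first_record = header[:80]  # only the first 80-byte record is inspected
    if first_record.startswith(XPORT_PREFIX):
        return "xport"
    if CPORT_MARKER in first_record:
        return "cport"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 80 bytes of a file and classify them."""
    with open(path, "rb") as f:
        return sniff_sas_transport(f.read(80))
```

A reader could call something like `sniff_file` before parsing and, on "cport", raise an error telling the user to re-save the data as XPORT instead of failing with a generic parse error.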

bghill, Aug 06 '21 14:08