haven icon indicating copy to clipboard operation
haven copied to clipboard

The exported file cannot be opened by SAS

Open Jiang-Li-backup opened this issue 7 years ago • 40 comments

I exported data to a SAS file. This file can be read by haven, but not the SAS. I am using SAS EG 7.1.

Jiang-Li-backup avatar Oct 08 '16 02:10 Jiang-Li-backup

Please provide a reproducible example.

hadley avatar Oct 08 '16 14:10 hadley

Thx. I also compared with the same data set created by SAS. The file generated by SAS EG is twice as big as that generated by R Haven. May I send the data files and code to you by email?

Jiang-Li-backup avatar Oct 09 '16 01:10 Jiang-Li-backup

Sure - both haven and SAS versions of the same file would be super helpful.

hadley avatar Oct 09 '16 13:10 hadley

I emailed the data and code to you a few days ago. Please let me know if you have not received it. Thx.

Jiang-Li-backup avatar Oct 12 '16 21:10 Jiang-Li-backup

Observation

Byte 33 has value 33 which indicates that the R SAS datasets use the layout of unix 64 bit.

I will try to read the R SAS dataset on Unix 64.

This should not matter, Win SAS should be able to read the dataset. I will look at the rest of the layout. Unfortunately, I don't have WPS on Unix so I will create a unix SAS SAS dataset(not WPS) so more of the R layout can be checked. SAS datasets seemed to have more filler which complicates the comparison.

At least the magic string matches

0-32 bytes Magic string 000000000000000000000000C2EA8160B31411CFBD92080009C7318C181F1011 WPS 000000000000000000000000C2EA8160B31411CFBD92080009C7318C181F1011 R

33-64 bytes 2222003333010232040000000000000000000301181F10112222003333010232 WPS 3300002200010031000000000000000000000000000000000000000000000000 R

rogerjdeangelis avatar Oct 26 '16 18:10 rogerjdeangelis

A quick note about the "filler". SAS files are organized into a "header" followed by 1 or more "pages". By default haven will write a header of size 1024 bytes and pages of size 4096 bytes. This layout was common with older versions of SAS, but newer versions seem to use much larger header and page sizes. The sizes are defined as 32-bit integers at (or near) byte offsets 200 and 204. In any event, extra-large page sizes are the source of filler in the SAS files. If you skip to the beginning of each page it should be easier to analyze the file.

For binary debugging I use hecate.

evanmiller avatar Oct 27 '16 06:10 evanmiller

Thanks for the info and the excellent work.

None of Matt's test datasets have byte 32=33 and byte 35=22. I have checked many other datasets created with different hardware and none of them have the 33/22 combination that write_sas creates

Test datasets for Matt SAS7BDAT

                      32 33 34 35 36 37

compress_no.sas7bdat 22 22 00 32 22 01 02 32 9.0101M3NET_ASRV........ compress_yes.sas7bdat 22 22 00 32 22 01 02 32 9.0101M3NET_ASRV........ lowbwt_i386.sas7bdat 22 22 00 32 22 01 02 32 9.0202M0W32_VSPRO....... missing_values.sas7bdat 22 22 00 32 22 01 02 32 9.0202M0W32_VSPRO....... obs_all_perf_1.sas7bdat 22 22 00 32 22 01 02 32 9.0101M3XP_PRO.......... adsl.sas7bdat 22 22 00 33 33 01 02 32 ....9.0202M3X64_ESRV.... eyecarex.sas7bdat 22 22 00 33 22 00 02 31 ....9.0000M0WIN......... lowbwt_x64.sas7bdat 22 22 00 33 33 01 02 32 ....9.0202M2X64_VSPRO... natlterr1994.sas7bdat 33 22 00 33 33 00 02 31 ........9.0101M3SunOS... natlterr2006.sas7bdat 33 22 00 33 33 00 02 31 ........9.0101M3SunOS... txzips.sas7bdat 33 22 33 22 00 33 33 01 02 31 ........9.0201M0Linux...

rogerjdeangelis avatar Nov 06 '16 17:11 rogerjdeangelis

@rogerjdeangelis Nice detective work!

evanmiller avatar Nov 07 '16 12:11 evanmiller

Hello @hadley, do you have some news about this issue ?

realitix avatar May 10 '17 15:05 realitix

I have the same issue. I can create a data frame from csv and write it to sas7bdat, however I can't open them in SAS viewer and results in error. I can see the file size has increased(i.e data is written within the file)

TEST.zip

Steps to reproduce:

Convert attached test.CSV to data frame

test <- read.csv("test.csv")

test.csv contain 5 obs

library("haven"); write_sas(test, "one.sas7bdat");

mnarasimhan02 avatar Jul 18 '17 21:07 mnarasimhan02

Is there a solution to this issue yet?

Kenkleinman avatar Aug 30 '17 19:08 Kenkleinman

It's really misleading that the readme for this project indicates that haven is capable of writing SAS data files when this and several related bugs indicate that it can't.

rambler avatar Nov 02 '17 20:11 rambler

@evanmiller Sorry for kicking up yet another old issue, but we are running into this problem as well. Data written using write_sas cannot be loaded by SAS (9.4), but works fine when read using read_sas from inside R.

The following is enough to reproduce it on Windows Server 2012R2 using the latest (master) version of haven and SAS 9.4. The exported file dataset.sas7bdat fails to load with an error that the dataset is not a SAS dataset.

library(haven)
dat <- as.data.frame(matrix(1, 10, 10))
write_sas(dat, 'dataset.sas7bdat')

normark avatar Jan 11 '18 21:01 normark

Though not a direct solution, if an R > SAS workflow is necessary for your work it looks like haven::write_sav() could be a temporary workaround. SAS can then read in the .sav file and variable labels, formats, etc persist (whereas they wouldn't with readr::write_csv(), etc).

So, in R:

library(tidyverse)
library(haven)
# View first 10 obs
diamonds(head, 10)

#> # A tibble: 10 x 10
#>    carat       cut color clarity depth table price     x     y     z
#>    <dbl>     <ord> <ord>   <ord> <dbl> <dbl> <int> <dbl> <dbl> <dbl>
#>  1  0.23     Ideal     E     SI2  61.5    55   326  3.95  3.98  2.43
#>  2  0.21   Premium     E     SI1  59.8    61   326  3.89  3.84  2.31
#>  3  0.23      Good     E     VS1  56.9    65   327  4.05  4.07  2.31
#>  4  0.29   Premium     I     VS2  62.4    58   334  4.20  4.23  2.63
#>  5  0.31      Good     J     SI2  63.3    58   335  4.34  4.35  2.75
#>  6  0.24 Very Good     J    VVS2  62.8    57   336  3.94  3.96  2.48
#>  7  0.24 Very Good     I    VVS1  62.3    57   336  3.95  3.98  2.47
#>  8  0.26 Very Good     H     SI1  61.9    55   337  4.07  4.11  2.53
#>  9  0.22      Fair     E     VS2  65.1    61   337  3.87  3.78  2.49
#> 10  0.23 Very Good     H     VS1  59.4    61   338  4.00  4.05  2.39

#  Export to sav
diamonds %>%
  head(10) %>%
  write_sav( "<path-to-file>")

Then, in SAS:

/* Import from sav */
proc import out = diamonds
            datafile = "<path-to-file>"
            dbms = SAV replace;
run;
***
NOTE: The import data set has 10 observations and 10 variables.
NOTE: WORK.DIAMONDS data set was successfully created.
NOTE: PROCEDURE IMPORT used (Total process time):
      real time           0.03 seconds
      cpu time            0.01 seconds
**;


/* View the data */
ods listing;
proc print data = diamonds;
run;

***
                                       The SAS System

 Obs    carat cut       color clarity    depth    table    price        x        y        z

   1     0.23 Ideal       E    SI2       61.50    55.00      326     3.95     3.98     2.43
   2     0.21 Premium     E    SI1       59.80    61.00      326     3.89     3.84     2.31
   3     0.23 Good        E    VS1       56.90    65.00      327     4.05     4.07     2.31
   4     0.29 Premium     I    VS2       62.40    58.00      334     4.20     4.23     2.63
   5     0.31 Good        J    SI2       63.30    58.00      335     4.34     4.35     2.75
   6     0.24 Very Good   J    VVS2      62.80    57.00      336     3.94     3.96     2.48
   7     0.24 Very Good   I    VVS1      62.30    57.00      336     3.95     3.98     2.47
   8     0.26 Very Good   H    SI1       61.90    55.00      337     4.07     4.11     2.53
   9     0.22 Fair        E    VS2       65.10    61.00      337     3.87     3.78     2.49
  10     0.23 Very Good   H    VS1       59.40    61.00      338     4.00     4.05     2.39

**;

adam-garcia avatar Feb 09 '18 23:02 adam-garcia

Can you please try again with the development version of haven?

hadley avatar Feb 15 '18 17:02 hadley

The free version of WPS, WPS Express, can create a SAS dataset of unlimited size with 'Proc R. WPS interfaces with R in the same way SAS/IML interfaces with R.

It looks like there are no 'non-commercial' limitations with express.

download from https://www.worldprogramming.com/us/try-or-buy/wps/editions/express

You can use haven to import large SAS datasets and WPS/Proc R to create a unlimited output SAS dataset. Not sure but the same may be true for 'proc python(WPS van create a SAS dataset from Python.

Otherwise I like to use mySQL or MS Access to move data from/to SAS.

On Thu, Feb 15, 2018 at 12:18 PM, Hadley Wickham [email protected] wrote:

Can you please try again with the development version of haven?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tidyverse/haven/issues/224#issuecomment-365998294, or mute the thread https://github.com/notifications/unsubscribe-auth/AMc0hW-Eb6API9VZ2QvfXm-w1awg3kh4ks5tVGbrgaJpZM4KRl8G .

rogerjdeangelis avatar Feb 15 '18 17:02 rogerjdeangelis

There is some information about interfaces between WPS/SAS and pyhon/R on my github

https://github.com/rogerjdeangelis

On Thu, Feb 15, 2018 at 12:34 PM, Roger DeAngelis <[email protected]

wrote:

The free version of WPS, WPS Express, can create a SAS dataset of unlimited size with 'Proc R. WPS interfaces with R in the same way SAS/IML interfaces with R.

It looks like there are no 'non-commercial' limitations with express.

download from https://www.worldprogramming.com/us/try-or-buy/wps/editions/express

You can use haven to import large SAS datasets and WPS/Proc R to create a unlimited output SAS dataset. Not sure but the same may be true for 'proc python(WPS van create a SAS dataset from Python.

Otherwise I like to use mySQL or MS Access to move data from/to SAS.

On Thu, Feb 15, 2018 at 12:18 PM, Hadley Wickham <[email protected]

wrote:

Can you please try again with the development version of haven?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/tidyverse/haven/issues/224#issuecomment-365998294, or mute the thread https://github.com/notifications/unsubscribe-auth/AMc0hW-Eb6API9VZ2QvfXm-w1awg3kh4ks5tVGbrgaJpZM4KRl8G .

rogerjdeangelis avatar Feb 15 '18 18:02 rogerjdeangelis

Just installed the dev version. The exported sas file still can not be opened by SAS Enterprise Guide 7.1. image

Jiang-Li-backup avatar Feb 15 '18 18:02 Jiang-Li-backup

same issue as @Jiang-Li An observation: library(haven) class<-read_sas("class.sas7bdat")
write_sas(class, "class_fromR.sas7bdat") class_fromr <-read_sas("class_fromR.sas7bdat")
The first read_sas gives back a tibble "class", but the second one class_fromr is a data frame.

identical(class, class_fromr) [1] FALSE

bellafeng avatar Feb 21 '18 05:02 bellafeng

same issue here

xiaodaigh avatar Feb 26 '18 20:02 xiaodaigh

Ok, we'll keep trying. Thanks for all the feedback!

hadley avatar Feb 26 '18 20:02 hadley

@hadley What is the latest development on this issue? I have the same problem!

ningjingzhiyuan507 avatar Jul 31 '18 12:07 ningjingzhiyuan507

I can confirm that I am also having issues, as described here, on the latest version of the CRAN package and R/Rstudio.

al-obrien avatar Sep 18 '18 21:09 al-obrien

Same issue.

JoviaNierenberg avatar Oct 11 '18 21:10 JoviaNierenberg

Same issue.

I still have the same problem here for haven package.

caimiao0714 avatar Oct 24 '18 19:10 caimiao0714

Same issue. Here's all the debug information I could think to collect.

SAS / Linux version: NOTE: Copyright (c) 2002-2012 by SAS Institute Inc., Cary, NC, USA. NOTE: SAS (r) Proprietary Software 9.4 (TS1M2) NOTE: This session is executing on the Linux 3.0.101-108.21-default (LIN X64) platform. NOTE: Updated analytical products:

      SAS/STAT 13.2
      SAS/ETS 13.2
      SAS/OR 13.2
      SAS/IML 13.2
      SAS/QC 13.2

R/Linux Versions:

R.Version() $platform [1] "x86_64-conda_cos6-linux-gnu"

	$arch
	[1] "x86_64"

	$os
	[1] "linux-gnu"

	$system
	[1] "x86_64, linux-gnu"

	$status
	[1] ""

	$major
	[1] "3"

	$minor
	[1] "4.3"

	$year
	[1] "2017"

	$month
	[1] "11"

	$day
	[1] "30"

	$`svn rev`
	[1] "73796"

	$language
	[1] "R"

	$version.string
	[1] "R version 3.4.3 (2017-11-30)"

	$nickname
	[1] "Kite-Eating Tree"

Package Versions: haven version 1.1.0

file size comparison mtcars exported form haven vs. mtcars created natively in SAS (from csv exported from R): -rw-r--r-- 1 9216 Nov 9 11:35 haven_mtcars.sas7bdat -rw-r--r-- 1 1236 Nov 9 11:21 mtcars.txt -rw-r--r-- 1 131072 Nov 9 11:34 sas_mtcars.sas7bdat

example.tar.gz

carloswpla avatar Nov 09 '18 18:11 carloswpla

Hi, Is there any update on this anywhere - I'm having the same issue. write_sas() files cannot be opened in SAS (9.4)

oljees avatar Nov 21 '18 15:11 oljees

Still an issue with haven 2.0.0. Perhaps the issue could be noted in the documentation until a fix is found. By the way, SAS has never published the technical specs for .sas7bdat files, which are Windows-specific. The SAS transport format (.xpt extension) is platform independent and openly documented. These files can be read and written with the foreign and SASxport packages.

mlaviolet avatar Nov 26 '18 13:11 mlaviolet

Same issue. Easiest way to replicate it

step 1

library(haven) write_sas(mtcars,"mtcars.sas7bdat")

step 2

open mtcars.sas7bdat in SAS and it says not compatible etc...

anuj2054 avatar Dec 04 '18 21:12 anuj2054