SASLib.jl

Missing data from first META page

Open tk3369 opened this issue 6 years ago • 4 comments

It seems that data residing in the first META page is missing. I guess this might have been introduced in the last major refactoring.

Examples:

data_pandas/test2.sas7bdat

julia> readsas("data_pandas/test2.sas7bdat")
Read data_pandas/test2.sas7bdat with size 10 x 100 in 0.00088 seconds
SASLib.ResultSet (10 rows x 100 columns)
Columns 1:Column1, 2:Column2, 3:Column3, 4:Column4, 5:Column5, 6:Column6, 7:Column7, 8:Column8, 9:Column9, 10:Column10 …
1: 0.0, , 0.0, 1960-01-01, 0.0, , 0.0, 0.0, 0.0, 
2: 0.0, , 0.0, 1960-01-01, 0.0, , 0.0, 0.0, 0.0, 
3: 0.0, , 0.0, 1960-01-01, 0.0, , 0.0, 0.0, 0.0, 
4: 0.0, , 0.0, 1960-01-01, 0.0, , 0.0, 0.0, 0.0, 
5: 0.0, , 0.0, 1960-01-01, 0.0, , 0.0, 0.0, 0.0, 

data_AHS2013/omov.sas7bdat

The first 103 records are missing as compared with results from ReadStat.

tk3369, May 18 '19 06:05

I've encountered this bug recently. I have a dataset (that I unfortunately can't share) where it skips the first 48 rows. What ends up happening is that it appends these "empty" rows at the bottom of the dataset, e.g. I see something like the above, with 0.0 or blank values.

pmbaumgartner, Jan 07 '21 18:01

Can you create a synthetic dataset and try to replicate the issue? For example, one with a similar pattern of missing values but random data. Then we can see how it works.

xiaodaigh, Jan 07 '21 21:01

I'll try and generate something that replicates this. I think it has something to do with the size of the dataset: I've got 1800 columns and that seems to upset whatever I throw at this.

pmbaumgartner, Jan 08 '21 01:01

"I've got 1800 columns"

If you can generate a synthetic one that fails, I can log the file here too for others to test: https://github.com/xiaodaigh/sas7bdat-resources

The hardest thing about SAS is to get sample files.

xiaodaigh, Jan 08 '21 02:01