pyreadr

Cannot read R lists

Open DachuanZhao opened this issue 5 years ago • 10 comments

I get wrong results when I load an RData file. Here is the file: FILE

DachuanZhao avatar Feb 28 '19 02:02 DachuanZhao

Would you be so kind as to indicate what exactly is wrong? Also, I do not have access to the file; I have requested it through Dropbox.

ofajardo avatar Feb 28 '19 07:02 ofajardo

image

When I load it in Python, the result has only one key. I think it should look like this:

{"historicalCongestion":pandas.DataFrame,
"testData":list,
"trainingData":list}

DachuanZhao avatar Feb 28 '19 08:02 DachuanZhao

From the pyreadr README, known limitations section:

Lists are not read

The limitation comes from librdata. I have filed an issue but it will take time to eventually get fixed.
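As a rough sketch of how to spot this situation from Python: `pyreadr.read_r` returns a dict-like mapping from object name to `pandas.DataFrame`, so objects that could not be read (such as the lists reported above) are simply absent from the keys. The helper names `report_missing` and `load_and_report` are made up for illustration, and the expected names are the ones listed earlier in this thread.

```python
def report_missing(loaded, expected_names):
    """Return the expected object names that are absent from the loaded result."""
    return [name for name in expected_names if name not in loaded]


def load_and_report(path, expected_names):
    """Read an RData file with pyreadr and list the objects that did not come through."""
    import pyreadr  # lazy import: only needed when a file is actually read

    # read_r returns a dict-like mapping of object name -> pandas.DataFrame
    result = pyreadr.read_r(path)
    return result, report_missing(result, expected_names)


# Example with a stand-in for the result described above (only one key present)
loaded = {"historicalCongestion": None}
expected = ["historicalCongestion", "testData", "trainingData"]
print(report_missing(loaded, expected))
```

A check like this does not recover the lists; it only makes the gap explicit instead of leaving it to be discovered later.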

ofajardo avatar Feb 28 '19 11:02 ofajardo

Hey @ofajardo,

I noticed that your issue was recently solved by this pull request.

I am working on a project that could use this update.

Are you already working on fixing this issue? Do you have a time horizon on when to fix this issue?

Thanks :)

JoaoCarabetta avatar Jul 10 '19 11:07 JoaoCarabetta

Hi Joao, I am going to work on that next week, as this week I am attending EuroPython (do you happen to be here?). But note that the pull request only solves reading an R vector of type DATE (issue #9); it will not solve reading lists (this issue). Is this what you are expecting?

ofajardo avatar Jul 10 '19 13:07 ofajardo

Otto,

Unfortunately, I am not attending EuroPython, but I'll definitely check out the talks online.

About the problem, I actually need more than dates. My current .rds has shapefiles in it. I posted the problem on Stack Overflow if you want more details.

Do you know any other solution to read those besides rpy2?

JoaoCarabetta avatar Jul 10 '19 13:07 JoaoCarabetta

Interesting. Actually, that shapefile is an S3 object of class "sf" that sits on top of a normal data frame (which in turn is an object on top of a list). You can check this in R by doing class(shapeobject). The interesting thing is that pyreadr can read, for example, tibbles, which are also S3 objects on top of data frames. I tried changing the class of this shapefile to just data.frame, and now pyreadr gives a different error: the file has unsupported features. So there are more things in this object that pyreadr/librdata cannot read.

As far as I know, you are out of luck and will have to rely on rpy2. R interoperability is awful, and most of the time you will need R in order to read R objects. The only other choice is to save them in some more interoperable format. =(
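For reference, a minimal sketch of the rpy2 route, assuming R and rpy2 are installed and the .rds file holds a named R list. It calls R's own `readRDS` through `rpy2.robjects`; the helper names `r_list_to_dict` and `read_rds_as_dict` are hypothetical, and the elements are returned as rpy2 objects without further conversion.

```python
def r_list_to_dict(names, values):
    """Pair R list element names with their values in a plain dict.

    Elements with an empty name get a positional key such as "element_2"
    (1-based, like R).
    """
    result = {}
    for position, (name, value) in enumerate(zip(names, values), start=1):
        key = name if name else f"element_{position}"
        result[key] = value
    return result


def read_rds_as_dict(path):
    """Read a .rds file holding a named R list via rpy2 (needs R installed)."""
    import rpy2.robjects as ro  # lazy import: rpy2 and R are only needed here

    r_object = ro.r["readRDS"](path)  # call R's readRDS on the file
    # Assumption: the object is a list with a names attribute
    return r_list_to_dict(list(r_object.names), list(r_object))
```

Each value is still an R object wrapped by rpy2, so nested structures such as the geometry discussed here would need their own conversion step.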

You can also submit an issue to librdata directly, asking for support for reading these objects. As they are built on top of a data frame, maybe it is not too hard for them to implement, but I have no idea. If they implement the support, then I can take that and expose it from pyreadr.

ofajardo avatar Jul 10 '19 16:07 ofajardo

Oh, now I see that the geometry in this data frame contains nested objects, which seem to be built on top of a list of matrices (?). Besides not reading lists, librdata also cannot read nested objects in data frames, and it does not read the dimensions of matrices either, so supporting this correctly looks far-fetched. Anyway, you can raise the issue in librdata if you like, to make them aware that these missing features are needed by the community.

ofajardo avatar Jul 10 '19 16:07 ofajardo

Hey,

Thanks for the attention.

I think we are going to change the format to something more interpretable, so that we don't need to rely on several libraries being updated.

But, if you think that the problem is interesting, this may have great value to future geo projects :)

JoaoCarabetta avatar Jul 11 '19 16:07 JoaoCarabetta

Issue tracked here: https://github.com/WizardMac/librdata/issues/32

ofajardo avatar Dec 17 '20 10:12 ofajardo