pyam

Metadata attribute

Open znicholls opened this issue 7 years ago • 8 comments

In the climate community, we often read data from netCDF files. These files contain a huge amount of metadata (when the file was written, by whom, how, etc.). For SCMs, we often read these files in, take some average, and then use the result for whatever we need.

Given the usefulness of the IamDataFrame, we would want to read them into that format. However, in such a dataframe, there is nowhere obvious to store all the metadata.

Would we consider adding a metadata attribute where we could store whatever we wanted (a dict of metadata, strings)? We wouldn't check it in any way; it would simply be a catch-all for users who need somewhere to store extra information. I don't think it belongs in meta, because 'who wrote the file' isn't something we want in such a meta dataframe.
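As a rough illustration of the idea (a sketch only, not pyam code): the class name `TimeseriesWithMetadata`, its fields, and the example values below are all hypothetical. It shows a container that carries an unchecked, free-form `metadata` dict next to the timeseries rows.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class TimeseriesWithMetadata:
    """Hypothetical container: timeseries rows plus unchecked file-level metadata."""

    # Each row is (model, scenario, variable, year, value).
    rows: list[tuple[str, str, str, int, float]]
    # Free-form catch-all; nothing stored here is validated or interpreted.
    metadata: dict[str, Any] = field(default_factory=dict)


# Made-up values standing in for what a netCDF header might contain.
ts = TimeseriesWithMetadata(
    rows=[("model_a", "scen_1", "Surface Temperature", 2020, 1.1)],
    metadata={"written_by": "example-user", "source_file": "example.nc"},
)

# Users can attach anything else they need later on.
ts.metadata["note"] = "global annual mean"
```

The point of keeping `metadata` separate from the rows is that file-level information is stored once, rather than being repeated for every model/scenario entry.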

znicholls avatar Oct 25 '18 15:10 znicholls

I don't see why adding such attributes to the meta table would be problematic. In particular, if you think about reading data from multiple sources (files) using append(), it would make sense to me to be able to keep track of the source there. That would immediately offer the option to filter a merged IamDataFrame back to its components using df.filter(source=file_name) or similar (assuming that the filename is added as a meta column during import).
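To make the suggestion concrete, here is a minimal sketch in plain pandas rather than pyam itself. The helper `read_with_source`, the column layout, and the file names are assumptions for illustration; a `source` column stands in for a meta column, and boolean indexing stands in for `df.filter(source=file_name)`.

```python
import pandas as pd

COLUMNS = ["model", "scenario", "variable", "year", "value"]


def read_with_source(records, source):
    """Build a frame from records and tag every row with its source name."""
    df = pd.DataFrame(records, columns=COLUMNS)
    df["source"] = source  # plays the role of a meta column added during import
    return df


# Two hypothetical input files, each contributing one row.
a = read_with_source([("model_a", "scen_1", "Emissions|CO2", 2020, 10.0)], "file_a.nc")
b = read_with_source([("model_b", "scen_1", "Emissions|CO2", 2020, 12.0)], "file_b.nc")

# Merge the sources, analogous to appending one frame to another.
merged = pd.concat([a, b], ignore_index=True)

# Split the merged data back into the part that came from one file.
from_a = merged[merged["source"] == "file_a.nc"]
```

Because the source name travels with the data, the merged frame can be split back into its components without any separate bookkeeping.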

danielhuppmann avatar Oct 30 '18 13:10 danielhuppmann

If you think it's possible, that would be great. I'm just concerned about carrying around 50 columns of metadata (or more; magicc has 200 parameters that we would ideally store in this attribute), but maybe that concern is unfounded?

znicholls avatar Oct 30 '18 18:10 znicholls

@gidden @danielhuppmann a little nudge on this one

znicholls avatar Nov 19 '18 14:11 znicholls

Hey @znicholls, @rgieseke and I were discussing this during IAMC and had a thought. Given that there are massive metadata requirements (and perhaps also gridded data!), would it make more sense to develop a class for openscm data that is based on an xarray.Dataset? We could develop an interface similar to pyam.IamDataFrame and/or just provide to/from utils to go back and forth.

gidden avatar Nov 19 '18 14:11 gidden

@rgieseke and I also thought it would be a good idea to have a short monthly call to discuss efforts in slightly more detail. Maybe this would be a good topic for the first one?

gidden avatar Nov 19 '18 14:11 gidden

I think so. That would basically settle the question of creating a superclass of IamDataFrame which OpenSCMDataFrame could then subclass, right? If yes, I'll put tidying up those MRs on the list, to start making room for the changes and to give us a set of tests that help us keep track of where things are at as we switch the backend.

znicholls avatar Nov 19 '18 14:11 znicholls

Yep sounds good. I'm assuming we want to try and get one in before Christmas?

znicholls avatar Nov 19 '18 15:11 znicholls

Would be happy to. I can try to make a doodle poll, but if I dally for too long, feel free to jump in =)

gidden avatar Nov 19 '18 15:11 gidden