Define a common `read_h5` function for _h5py + pandas_ and _dask array_

Open andypbarrett opened this issue 5 years ago • 5 comments

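import h5py
import numpy as np

def read_h5(fname, vnames=()):
    """Read a list of vars [v1, v2, ..] -> 2D."""
    # Open read-only; the context manager closes the file on exit.
    with h5py.File(fname, 'r') as f:
        return np.column_stack([f[v][()] for v in vnames])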

This could be used for both the pandas and dask array cells. Maybe it could be added to icepyx or offered as part of a separate tool set.

andypbarrett avatar Aug 13 '20 23:08 andypbarrett

@andypbarrett Has this been suggested to the icepyx project? Is there still value in us pursuing this within IceFlow and/or any of our other tutorial notebooks?

asteiker avatar Aug 02 '21 18:08 asteiker

I don't know about icepyx. It has been a while since I have attended.

However, reading H5 files is problematic because the structure is not consistent. Many of the ATL?? files have the data buried in groups of groups, which makes simple general solutions difficult to find.
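To illustrate the nesting problem, here is a minimal sketch, not a tested recipe for any particular ATL product. It walks an HDF5 file with h5py's `visititems` to discover where the datasets live, then reads a few variables from one group. The helper names `list_datasets` and `read_h5_nested` are hypothetical, and the demo file is synthetic: its group and variable names only imitate the style of an ICESat-2 layout.

```python
import os
import tempfile

import h5py
import numpy as np


def list_datasets(fname):
    """Return {path: shape} for every dataset, however deeply nested."""
    found = {}

    def visitor(name, obj):
        # visititems calls this for every group and dataset in the file;
        # returning None tells h5py to keep walking.
        if isinstance(obj, h5py.Dataset):
            found[name] = obj.shape

    with h5py.File(fname, "r") as f:
        f.visititems(visitor)
    return found


def read_h5_nested(fname, group, vnames):
    """Read equal-length 1D variables from one group into a 2D array."""
    with h5py.File(fname, "r") as f:
        g = f[group]
        return np.column_stack([g[v][()] for v in vnames])


# Build a small synthetic file (assumed layout, for illustration only).
tmpdir = tempfile.mkdtemp()
demo_path = os.path.join(tmpdir, "demo.h5")
with h5py.File(demo_path, "w") as f:
    seg = f.create_group("gt1l/land_ice_segments")
    seg["latitude"] = np.array([70.0, 70.1, 70.2])
    seg["h_li"] = np.array([1500.0, 1501.5, 1503.0])

datasets = list_datasets(demo_path)
data = read_h5_nested(demo_path, "gt1l/land_ice_segments", ["latitude", "h_li"])
```

The group path still has to be supplied per product, which is exactly the part that resists a single general solution.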

@betolink and I were discussing using fsspec or something similar to hardcode a recipe to read the ICESat-2 files.

andypbarrett avatar Aug 03 '21 22:08 andypbarrett

I think your Issue was originally filed in reference to IceFlow. So perhaps this is better left under their backlog versus an addition to our Tutorials repo itself?

asteiker avatar Aug 05 '21 18:08 asteiker

Reading IS2 HDF files using a common data model is beyond IceFlow. Many projects could benefit from these capabilities. I also wonder what the status is with icepyx. That's probably the best place to implement it.

betolink avatar Aug 06 '21 15:08 betolink

CRYO-199

asteiker avatar Jan 23 '24 19:01 asteiker