Ryan Abernathey

1176 comments by Ryan Abernathey

Thanks for reporting, Sean. This is ringing a bell somewhere deep in my brain, but I can't seem to find an issue for it anywhere. @martindurant - any thoughts?

Perhaps we want to define a recipe that will create a kerchunk-virtual-zarr that always points to the "latest" data, whatever that means. Kind of similar to the ideas explored in...

@martindurant - the code above worked quite well out of the box! It took about 20s to index the file, and I didn't even run it from inside Azure. Whether...

Here is what the index file looks like ``` {"domain": "g", "date": "20220126", "time": "0000", "expver": "0001", "class": "od", "type": "pf", "stream": "enfo", "step": "0", "levtype": "sfc", "number": "29", "param":...

> then this might indeed be the case that we don't need cfgrib even with the index file, it seems like there must be some metadata in the grib file...

So perhaps we want to add an optional `index_file` argument to `scan_grib`, which would provide a fast path to constructing the references.
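A minimal sketch of what such a fast path might look like, assuming the index file is JSON lines (one message per line, as in the snippet above) and that each entry carries its byte range in `_offset` and `_length` fields. Those two field names, the `refs_from_index` helper, and the key layout are assumptions for illustration, not part of kerchunk's actual API:

```python
import json


def refs_from_index(index_text, grib_url):
    """Map each indexed GRIB message to a kerchunk-style [url, offset, length] entry.

    Hypothetical helper: the "_offset"/"_length" field names and the key layout
    are assumptions and should be checked against a real index file.
    """
    refs = {}
    for line in index_text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the index
        entry = json.loads(line)
        # Build a readable key from whichever identifying fields are present.
        key = "/".join(
            str(entry[k]) for k in ("param", "levtype", "number", "step") if k in entry
        )
        # The byte range is read straight from the index, so the GRIB file
        # itself never has to be opened or scanned.
        refs[key] = [grib_url, entry["_offset"], entry["_length"]]
    return refs
```

This only covers the byte-range lookup. Building complete references would still need the array metadata (shape, dtype, coordinates), which the index alone may not provide.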

Tom, great example of Zarr hacking! I agree you are abusing Zarr filters though. 😬 Here is a related approach I tried for a different custom binary file format:...

> This is definitely part of the long-term kerchunk vision - it would work particularly well when chunks map to some file naming scheme.

To clarify, do you imagine including...

> Anecdotally, running with these additional logs indicated that the hang occurred most often at the call to `np.asarray`

So this is the point where we are actually reading data....

😩 I agree this is a major blocker. We should work hard to find a minimal reproducer (without Pangeo Forge) that we can take to fsspec or h5py to reproduce...