Ryan Abernathey

265 issues by Ryan Abernathey

While working on #2031 I became familiar with the new V3 Codec API and its peculiarities. And I saw that we don't yet have actual unit tests for the codecs....

V3
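Codec unit tests usually come down to an encode/decode round-trip property. A minimal sketch of that property follows — using zlib over raw bytes as a stand-in codec, not zarr's actual V3 Codec API:

```python
import zlib

import numpy as np

# Stand-in "codec" pair: zlib over the array's raw bytes. Real codec unit
# tests would exercise zarr's encode/decode classes instead.
def encode(arr: np.ndarray) -> bytes:
    return zlib.compress(arr.tobytes())

def decode(buf: bytes, dtype, shape) -> np.ndarray:
    return np.frombuffer(zlib.decompress(buf), dtype=dtype).reshape(shape)

def check_roundtrip(arr: np.ndarray) -> None:
    # The core property: decode(encode(x)) must reproduce x exactly.
    out = decode(encode(arr), arr.dtype, arr.shape)
    np.testing.assert_array_equal(out, arr)

check_roundtrip(np.arange(12, dtype="i2").reshape(3, 4))
```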

I've been playing with encoding larger datasets using the cf-xarray geometry approach. Here's some code:

```python
import geopandas as gp
import xvec
import xarray as xr

url = (
    "s3://overturemaps-us-west-2/release/2024-08-20.0/theme=buildings/type=building/"
...
```

Thanks so much for this amazing package! I just tried it out with some [Icechunk](https://icechunk.io/en/latest/) data and it worked great right out of the box! However, I am having one...

bug

Thanks for this great open source project! 🙏 I know that tensors are not supported yet, but I wanted to open an issue to enquire about their status on the...

**Describe the issue**: I have been observing a mild but consistent increase in memory when storing large arrays to Zarr. I have reproduced this with both s3fs (my real use...

bug
help wanted
memory
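When chasing a suspected leak like this, one way to probe per-iteration memory growth is the stdlib `tracemalloc` module. A minimal sketch, with a toy retained allocation standing in for the actual zarr/s3fs write loop:

```python
import tracemalloc

def per_iteration_growth(fn, repeats=3):
    """Run fn repeatedly and return the net traced allocation (bytes) per call.

    Steadily positive deltas suggest fn retains memory across calls — the
    shape of the symptom described above.
    """
    tracemalloc.start()
    deltas = []
    for _ in range(repeats):
        before, _ = tracemalloc.get_traced_memory()
        fn()
        after, _ = tracemalloc.get_traced_memory()
        deltas.append(after - before)
    tracemalloc.stop()
    return deltas

# Toy "leak": each call retains ~100 kB, so each delta stays large.
retained = []
growth = per_iteration_growth(lambda: retained.append(bytearray(100_000)))
print(growth)
```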

- [x] Closes #9792
- [ ] Tests added
- [ ] User visible changes (including notable bug fixes) are documented in `whats-new.rst`
- [ ] New functions/methods are listed...

topic-documentation

Zarr has its own internal mechanism for caching, described here:

- https://zarr.readthedocs.io/en/stable/tutorial.html#distributed-cloud-storage
- https://zarr.readthedocs.io/en/stable/api/storage.html#zarr.storage.LRUStoreCache

However, this capability is currently inaccessible from xarray. I propose to add a new keyword `cache=True/False`...

topic-documentation
topic-zarr
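The LRU store cache described in those docs is essentially a key/value wrapper that serves repeated chunk reads from memory and evicts least-recently-used entries once a size budget is exceeded. A minimal sketch of that mechanism (illustrative only — this mimics the idea behind zarr's `LRUStoreCache`, not its actual API):

```python
from collections import OrderedDict

class LRUCacheStore:
    """Sketch of an LRU read cache over a dict-like chunk store."""

    def __init__(self, store, max_size):
        self.store = store          # backing store (e.g. a remote object store)
        self.max_size = max_size    # cache budget in bytes
        self._cache = OrderedDict()
        self._size = 0

    def __getitem__(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)   # hit: mark as most recently used
            return self._cache[key]
        value = self.store[key]            # miss: fetch from backing store
        self._cache[key] = value
        self._size += len(value)
        while self._size > self.max_size:  # evict from the LRU end
            _, evicted = self._cache.popitem(last=False)
            self._size -= len(evicted)
        return value

    def __setitem__(self, key, value):
        self.store[key] = value            # write-through
        stale = self._cache.pop(key, None) # invalidate any cached copy
        if stale is not None:
            self._size -= len(stale)
```

A hypothetical `cache=True` keyword in xarray could wrap the underlying store in exactly this kind of adapter before handing it to zarr.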

This works with V2 data:

```python
import zarr

zarr.create(shape=10, dtype=">i2", zarr_version=2)  # -> >i2
```

But raises for V3:

```python
zarr.create(shape=10, dtype=">i2", zarr_version=3)
```

```
File ~/gh/zarr-developers/zarr-python/src/zarr/codecs/__init__.py:40, in _get_default_array_bytes_codec(np_dtype)
     37 def _get_default_array_bytes_codec(...
```

bug
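The traceback points at the selection of a default array-bytes codec from the requested dtype. Whatever the fix looks like, that selection hinges on the dtype's byte order, which NumPy exposes directly — a small illustration of reading it off (not zarr's actual selection logic):

```python
import numpy as np

def dtype_endianness(dt) -> str:
    """Classify a dtype's byte order — the information a default
    array-bytes codec would need from the requested dtype."""
    dt = np.dtype(dt)
    if dt.byteorder == ">":
        return "big"
    if dt.byteorder == "<":
        return "little"
    return "native"  # "=" (native) or "|" (not applicable, e.g. 1-byte types)

# ">i2" is non-native on x86/ARM hosts, so the ">" marker survives:
assert dtype_endianness(">i2") == "big"
assert dtype_endianness("i1") == "native"
```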

We created this package because Dask was unable to rechunk arrays in a scalable way. The motivation is described in this five-year-old Pangeo forum post: https://discourse.pangeo.io/t/best-practices-to-go-from-1000s-of-netcdf-files-to-analyses-on-a-hpc-cluster Since then, Dask has overhauled...

I'm working with the GPM-IMERG files from NASA. Here's an example:

```python
import xarray as xr
import fsspec

url = "https://earthmover-sample-data.s3.us-east-1.amazonaws.com/hdf5/3B-HHR.MS.MRG.3IMERG.19980101-S000000-E002959.0000.V07B.HDF5"
ds_nc = xr.open_dataset(
    fsspec.open(url).open(), engine="h5netcdf", group="Grid", decode_coords="all"
)
print(ds_nc)
```

```
...
```

bug
Kerchunk
CF conventions
parsers