Rasters.jl
handle zarr files
xarray writes zarr files with additional attributes. We should be able to read and write those attributes.
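For context, here is a rough sketch of what those additional attributes look like on disk. It assumes a Zarr v2 directory store, where each array keeps its user attributes in a `.zattrs` JSON file and xarray records the dimension names under the `_ARRAY_DIMENSIONS` key. The helper names, the `temperature` array and the `units` attribute are made up for illustration. Only the standard library is used, so this is not how xarray or Rasters.jl do it internally.

```python
import json
import tempfile
from pathlib import Path

# Key xarray uses in Zarr v2 stores to record dimension names.
DIMS_KEY = "_ARRAY_DIMENSIONS"


def write_array_attrs(store, name, dims, attrs):
    """Write only the `.zattrs` file of one array (hypothetical helper)."""
    array_dir = Path(store) / name
    array_dir.mkdir(parents=True, exist_ok=True)
    meta = {DIMS_KEY: list(dims)}
    meta.update(attrs)
    path = array_dir / ".zattrs"
    path.write_text(json.dumps(meta, indent=2))
    return path


def read_dims_and_attrs(store, name):
    """Split `.zattrs` into dimension names and the remaining attributes."""
    meta = json.loads((Path(store) / name / ".zattrs").read_text())
    dims = meta.pop(DIMS_KEY, [])
    return dims, meta


def demo():
    """Round-trip the attributes through a temporary directory."""
    with tempfile.TemporaryDirectory() as store:
        write_array_attrs(store, "temperature", ["time", "y", "x"], {"units": "K"})
        return read_dims_and_attrs(store, "temperature")
```

A reader that ignores `_ARRAY_DIMENSIONS` still gets the data but loses the dimension names, and a writer that omits it produces arrays that xarray cannot map back to named dimensions.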
@rafaqz how big of a lift do you think this would be? I’m too new to Julia to be of much use (yet), but I’d really like to start using your package to build tools for analyzing massive S3-hosted Zarr arrays, which I see as the future of geo raster data. We’re currently using xarray, but I’d like to move away from Python. Anyway, amazing work!
Zarr is actually supported by netcdf now. So all we have to do is add .zarr to the list of netcdf file types. Making it efficient may be another story, as DiskArrays.jl chunking is not implemented for NCDatasets.jl, which wraps netcdf.
Also, thanks for the support. But exercise some caution switching from Python, and make sure the other benefits are clear. The Julia raster tools are immature and unfunded compared to the Python ones.
The ease of development means we can probably match xarray/rasterio in a few years, but you will need to be prepared to make bug reports and work through some problems.
@rafaqz thanks for your thoughts and for tempering my expectations... I'll play around with the package as is to test performance. Very keen to see Julia become the de facto tool for analyzing RS data, but I see there is still a lot of work to be done. I'll also look around for a call that might provide funding to advance capabilities.
Be sure to make issues here for anything that is not competitive with Python. There is a new PR about to merge that will improve a lot, but lots more use and feedback will be needed to get there.
@rafaqz I'm sure you've already stumbled on this package but it has similar goals with some attention given to Zarr format: https://github.com/meggart/YAXArrays.jl
Yes, we actually collaborate on DiskArrays.jl, which is underneath both packages. But the ecosystem still has not settled on raster packages yet.
Just to follow up: I was unsuccessful at reading Zarr files with ArchGDAL or Rasters... I suspect this functionality will come in the near future.
Try NCDatasets.jl...
The zarr extension hasn't been added here, as I said above?
@rafaqz thanks for the suggestions. I've also had some success with YAXArrays.jl and ESDL.jl, but I'm still fumbling around in the dark early in the learning curve... but at least I can see shadows now ;-)
This works now but runs into the efficiency problems from ZarrDatasets.jl.
This only works for reading; at least, that was what my PR implemented.
What's the efficiency problem?
It's the same old CommonDataModel / DiskArrays slowness thing, I think.
NCDatasets.jl isn't slow though, so we probably need to fix ZarrDatasets.jl.
I'd also like the capability to instantiate a Raster from either a ZarrDataset or a Zarr array. I can probably implement this, but I'm commenting so I have somewhere to reference.
What's the best way to do that? The Raster constructor might need some overloads as well.
I think this already works on one of the branches.
@felixcremer mentioned on Slack that Zarr CRS specs are supposed to live in global metadata under a key.
We should integrate that into the cf branch at some point.