DimensionalData.jl

Make an IO interface for DimensionalData

Open felixcremer opened this issue 7 months ago • 7 comments

The idea would be to get inspiration from YAXArrayBase to enable saving DimArrays to different data backends.

We should separate the IO interface from the data convention interface.

felixcremer, May 07 '25 15:05

DimTree should be the main target of this data loader. Then you can read just part of the data by passing a keyword.

felixcremer, May 07 '25 15:05

So essentially merging Rasters and YAXArrayBase file loading mechanisms.

It would be extensible by using some runtime storage to add backends to a list.

Another option is dispatch on Val{:ext} to remove the runtime aspect.
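A minimal sketch of what dispatching on the extension could look like, in plain Julia with no packages. Every name here (`extension_tag`, `load_backend`, `load`) and the returned strings are hypothetical placeholders, not DimensionalData.jl or Rasters API.

```julia
# Hypothetical names throughout; this only illustrates the dispatch pattern.

# Extract the lowercase extension without the dot, e.g. "data.H5" -> :h5.
function extension_tag(path::AbstractString)
    ext = lowercase(splitext(path)[2])
    return Symbol(lstrip(ext, '.'))
end

# Fallback method: no backend has claimed this extension.
load_backend(::Val{ext}, path) where {ext} =
    error("no backend registered for extension .$ext")

# A backend package (or package extension) would add one method per extension.
# The return values are stand-ins for actually reading the file.
load_backend(::Val{:h5}, path) = "HDF5 backend would read $path"
load_backend(::Val{:nc}, path) = "NetCDF backend would read $path"

# Entry point: lift the extension into a type and let dispatch pick the backend.
load(path::AbstractString) = load_backend(Val(extension_tag(path)), path)
```

With this shape there is no list to mutate at runtime: adding a backend means adding a method. The `Val` is still built from a runtime string, so the call in `load` is resolved by dynamic dispatch.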

The simple canonical case would be HDF5 support, which could go in an extension here.

One problem is that GDAL reads a huge number of formats, so Rasters uses it as a fallback rather than listing them all.

rafaqz, May 07 '25 16:05

For the GDAL fallback I think we could add the regex r".*", which matches everything, and point that to the GDAL data type. If we push that to the end of the backend-to-package list, we can use it as a catch-all. I think that should be added by Rasters.
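Roughly what I have in mind, as a sketch only. `BACKENDS`, `register_backend!` and `find_backend` are made-up names, and the backend symbols stand in for whatever type each package would actually register.

```julia
# Hypothetical registry: an ordered list of filename pattern => backend pairs.
const BACKENDS = Pair{Regex,Symbol}[]

# Append a pattern to the end of the list.
register_backend!(pattern::Regex, backend::Symbol) =
    push!(BACKENDS, pattern => backend)

# Return the first backend whose pattern matches, or `nothing` if none does.
function find_backend(path::AbstractString)
    for (pattern, backend) in BACKENDS
        occursin(pattern, path) && return backend
    end
    return nothing
end

# Specific formats first.
register_backend!(r"\.(h5|hdf5)$"i, :HDF5)
register_backend!(r"\.nc$"i, :NetCDF)
# Catch-all last: anything not claimed above falls through to GDAL.
register_backend!(r".*", :GDAL)
```

Lookup order is the registration order, so anything pushed after the catch-all would never be reached. A real version would need some way to keep r".*" at the end when packages load in a different order.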

felixcremer, May 07 '25 21:05

One problem is DD has no concept of missing values other than base Missing, but most of this data has some sentinel missing value.

Rasters adds missingval to handle this in case people don't want missing slowing things down. Raster skipmissing can be much faster than base for this reason.
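To make the two options concrete, here is a small plain-Julia sketch. It is not the Rasters implementation: `skip_sentinel` is a hypothetical helper and the sentinel value -9999 is just an example.

```julia
# Example data where -9999 is the sentinel for "no data" (an assumed value).
data = Float32[1.5, -9999, 2.5, -9999, 4.0]
missingval = -9999f0

# Option 1: convert sentinels to `missing`.
# The element type widens to Union{Missing, Float32}.
with_missing = [isequal(x, missingval) ? missing : x for x in data]

# Option 2: keep the raw Float32 array and skip the sentinel while iterating.
# `isequal` is used so that a NaN sentinel would also be matched.
skip_sentinel(A, mv) = Iterators.filter(x -> !isequal(x, mv), A)

total_missing = sum(skipmissing(with_missing))
total_sentinel = sum(skip_sentinel(data, missingval))
```

Both give the same total. The difference is that option 2 never changes the array's element type, so the data can stay exactly as it is stored on disk.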

What does YAX do?

rafaqz, May 08 '25 11:05

This might be specific to the HDF5/NetCDF4 extension, but I'd just like to mention that it'd be really nice if it was possible to save/load DimArrays/DimStacks to arbitrary groups within an HDF5 file rather than overwriting the entire file. Xarray's .to_netcdf() does it using the group argument. I use it frequently and it's very handy.

JamesWrigley, May 08 '25 22:05

That would be nice! Would it just take a string argument separated by slashes?

asinghvi17, May 08 '25 22:05

Yeah I think that would do the trick 👍

JamesWrigley, May 09 '25 08:05