DimensionalData.jl
Stacking dimensions + combining variables to new dimension
Having dabbled in xarray for a project, I found two features quite useful, and I wonder whether DimensionalData supports them or could support them:
- `DataArray.stack` / `unstack`: Combine any number of existing dimensions into a single new dimension, and the inverse.
- `Dataset.to_dataarray` / `DataArray.to_dataset`: Combine the variables in a `Dataset` (`DimStack`) into a new coordinate, and the inverse.
These two operations combined can be useful for data analysis applications. For example, suppose you have a DimStack of temperature and pressure in lon x lat x time. You can easily analyze the data in PCA space by converting the variables to a variable dimension, then stacking the lon, lat, variable dimensions to get you a features x time matrix. After PCA and additional operations in PCA space, the inverse operations can reconstruct the original DimStack. Thoughts?
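To make the workflow above concrete, here is a minimal sketch of the underlying reshaping using only Base Julia arrays. It is not DimensionalData's or xarray's implementation, and it drops the dimension labels that those packages would carry along. The sizes and variable names are hypothetical, chosen only for illustration.

```julia
# Hypothetical grid sizes, for illustration only
nlon, nlat, ntime = 4, 3, 5
nvar = 2

temperature = rand(nlon, nlat, ntime)
pressure = rand(nlon, nlat, ntime)

# Step 1: combine the variables along a new "variable" axis,
# giving a lon x lat x variable x time array
combined = Array{Float64}(undef, nlon, nlat, nvar, ntime)
combined[:, :, 1, :] = temperature
combined[:, :, 2, :] = pressure

# Step 2: stack lon, lat and variable into one "features" axis,
# giving a features x time matrix
features = reshape(combined, nlon * nlat * nvar, ntime)

# ... PCA and other operations on `features` would go here ...

# Inverse: unstack the features axis, then split the variables again
restored = reshape(features, nlon, nlat, nvar, ntime)
temperature_back = restored[:, :, 1, :]
pressure_back = restored[:, :, 2, :]
```

The round trip is lossless here because `reshape` only reinterprets the layout. What a labeled-array package adds on top is remembering which lon, lat and variable each feature row came from.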
1. We already have this, see `mergedims`.
2. `DimArray(stack)` makes a DimArray of NamedTuples over the stack's layers. This is preferable in Julia because it lets the element types remain mixed, whereas a single array with a variable dimension loses that, and layers are often not all the same type.
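The point about mixed types can be illustrated with plain Base arrays. This is only a sketch of the idea, not of how DimensionalData builds its result, and the two layers (`temperature`, `counts`) are hypothetical.

```julia
# Two hypothetical layers with different element types
temperature = [280.5 281.0; 279.8 282.3]   # floating-point values
counts = [3 7; 1 4]                         # integer values

# One array whose elements are NamedTuples: each field keeps its own type
as_tuples = map((t, c) -> (temperature = t, counts = c), temperature, counts)

# One plain array with an extra "variable" axis: all elements are
# promoted to a common element type, so the integers become floats
as_array = cat(temperature, counts; dims = 3)

println(typeof(as_tuples[1, 1]))
println(eltype(as_array))
```

In the NamedTuple form the integer layer stays an integer, while concatenating along a new axis forces both layers into one element type.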
For 1, there is this open issue I made some time ago: https://github.com/rafaqz/DimensionalData.jl/issues/877. I opened it when I actually had to unstack/unmergedims a dimension with yearmonths and realized how hard it was to do.
True. `unmergedims` could also reconstruct missing rows, like the Tables.jl implementation does.
Thanks for the responses! I'll look into these options
No worries, please add to Tiem's issue or post here with any problems you have with functionality or docs. A comparison to xarray is a good way to improve things here! @sethaxen wrote most of `mergedims`, so they may have some insights as well.