Rasters.jl icon indicating copy to clipboard operation
Rasters.jl copied to clipboard

Rasters does not support constructing RasterSeries when rasters contain singleton time dimension

Open alex-s-gardner opened this issue 9 months ago • 8 comments

There are a lot of cases where raster data is provided with a singleton time dimension, e.g.:

Raster(rand(X(1:4), Y(1:4), Ti(1))) 

╭─────────────────────────╮
│ 4×4×1 Raster{Float64,3} │
├─────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
  ↓ X  Sampled{Int64} 1:4 ForwardOrdered Regular Points,
  → Y  Sampled{Int64} 1:4 ForwardOrdered Regular Points,
  ↗ Ti
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

I would be nice if RasterSeries could accept a list of files, using the granule Ti dimension as the RasterSeries dimension.

See discussion here: https://julialang.slack.com/archives/C9XBLUCVB/p1739382396826909

alex-s-gardner avatar Feb 12 '25 20:02 alex-s-gardner

A MWE of the behaviour one wants is:

m = Raster(rand(X(1:4), Y(1:4), Ti(1:100))
dataset = RasterSeries([m[Ti(i:i)] for i in axes(m, Ti)], Ti)

combine(dataset, Ti) == m

which does not currently happen.

asinghvi17 avatar Feb 12 '25 20:02 asinghvi17

this is something that yax does https://juliadatacubes.github.io/YAXArrays.jl/dev/UserGuide/read.html#along-a-new-dimension, not sure if that helps, or this is more on enhancing RasterSeries.

lazarusA avatar Feb 12 '25 20:02 lazarusA

You can drop the extra dim that gets formed and then rebuild the entire thing with the new dims you want...except the dropping doesn't work with diskarrays:

m = Raster(rand(X(1:4), Y(1:4), Ti(1:100)))
dataset = RasterSeries([m[Ti(i:i)] for i in axes(m, Ti)], :time)

cm2 = combine(dataset, :time)
all_tis = only.(dims.(dataset.data.data, (Ti,)))

rebuild(
    dropdims(
        combine(dataset, Rasters.DD.Dim{:time}); 
        dims = Ti
    ); 
    dims = (
        dims(cm2)[1:2]..., Ti(Rasters.DD.Sampled(all_tis))
    ) |> Rasters.DD.format
)

However, this does work (but adds an extra dim):

rebuild(
        cm2; 
    dims = (
        dims(cm2)[1:2]..., Dim{:useless}([1]), Ti(Rasters.DD.Sampled(all_tis))
    ) |> Rasters.DD.format
)
┌ 4×4×1×100 Raster{Float64, 4} ┐
├──────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── dims ┐
  ↓ X       Sampled{Int64} 1:4 ForwardOrdered Regular Points,
  → Y       Sampled{Int64} 1:4 ForwardOrdered Regular Points,
  ↗ useless Sampled{Int64} [1] ForwardOrdered Irregular Points,
  ⬔ Ti      Sampled{Int64} [1, 2, …, 99, 100] ForwardOrdered Irregular Points
├───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── raster ┤
  extent: Extent(X = (1, 4), Y = (1, 4), useless = (1, 1), Ti = (1, 100))
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
[:, :, 1, 1]
 ↓ →  1         2         3         4
 1    0.08388   0.270138  0.452117  0.673822
 2    0.1778    0.681501  0.458192  0.166061
 3    0.662556  0.772683  0.491663  0.668206
 4    0.128367  0.485971  0.815086  0.0306872

asinghvi17 avatar Feb 12 '25 22:02 asinghvi17

The simple way to test this with DiskArrays is:

dataset = RasterSeries([m[Ti(i:i)] |> Rasters.DiskArrays.cache for i in axes(m, Ti)], :time)

(note that we're wrapping the dimarray in a cacheddiskarray here)

This gives you a rasterseries full of diskarrays

┌ 100-element RasterSeries{Raster,1} ┐ ├────────────────────────────── dims ┤ ↓ time ├──────────────────────────── raster ┤ extent: Extent(X = (1, 4), Y = (1, 4), Ti = (1, 1)) └────────────────────────────────────┘ 4×4×1 CachedDiskArray{Float64, 3, Array{Float64, 3}, LRU{ChunkIndex{3, OffsetChunks}, OffsetArray{Float64, 3, Array{Float64, 3}}}} 4×4×1 CachedDiskArray{Float64, 3, Array{Float64, 3}, LRU{ChunkIndex{3, OffsetChunks}, OffsetArray{Float64, 3, Array{Float64, 3}}}} 4×4×1 CachedDiskArray{Float64, 3, Array{Float64, 3}, LRU{ChunkIndex{3, OffsetChunks}, OffsetArray{Float64, 3, Array{Float64, 3}}}} 4×4×1 CachedDiskArray{Float64, 3, Array{Float64, 3}, LRU{ChunkIndex{3, OffsetChunks}, OffsetArray{Float64, 3, Array{Float64, 3}}}}

asinghvi17 avatar Feb 12 '25 22:02 asinghvi17

@asinghvi17 Don't you want a view? That disk array will be materialised.

But yeah, RasterSeries should just do that automatically if you pass an empty Ti() dim

rafaqz avatar Feb 12 '25 23:02 rafaqz

Yeah but that's just construction...here we're starting out with a vector of 3d arrays that are lazy

asinghvi17 avatar Feb 12 '25 23:02 asinghvi17

Or even better, if RasterSeries isn't handed a dimension... and the files have a singleton dimension then they are automatically concatenated along that dimension

alex-s-gardner avatar Feb 13 '25 00:02 alex-s-gardner

Good idea we could totally detect singletons

rafaqz avatar Feb 13 '25 10:02 rafaqz