DimensionalData.jl

Make `DimArray`s functors

Open kapple19 opened this issue 4 months ago • 9 comments

Once interpolation is implemented (#420), can we make DimArray instances act as functors?

As an example and demonstration of available syntax:

julia> A = rand(X(1:3), Y(1:4), Z(1:5))
┌ 3×4×5 DimArray{Float64, 3} ┐
├────────────────────────────┴───────────────────── dims ┐
  ↓ X Sampled{Int64} 1:3 ForwardOrdered Regular Points,
  → Y Sampled{Int64} 1:4 ForwardOrdered Regular Points,
  ↗ Z Sampled{Int64} 1:5 ForwardOrdered Regular Points
└────────────────────────────────────────────────────────┘
[:, :, 1]
 ↓ →  1         2         3          4
 1    0.461538  0.781401  0.0116387  0.978326
 2    0.516508  0.54805   0.230822   0.574886
 3    0.886503  0.814381  0.286989   0.574057

julia> A(1.5, 2.5, 3.5)
ERROR: MethodError: objects of type DimArray{Float64, 3, Tuple{X{DimensionalData.Dimensions.Lookups.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}, Y{DimensionalData.Dimensions.Lookups.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}, Z{DimensionalData.Dimensions.Lookups.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}}, Tuple{}, Array{Float64, 3}, DimensionalData.NoName, DimensionalData.Dimensions.Lookups.NoMetadata} are not callable
Use square brackets [] for indexing an Array.
The object of type `DimArray{Float64, 3, Tuple{X{DimensionalData.Dimensions.Lookups.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}, Y{DimensionalData.Dimensions.Lookups.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}, Z{DimensionalData.Dimensions.Lookups.Sampled{Int64, UnitRange{Int64}, DimensionalData.Dimensions.Lookups.ForwardOrdered, DimensionalData.Dimensions.Lookups.Regular{Int64}, DimensionalData.Dimensions.Lookups.Points, DimensionalData.Dimensions.Lookups.NoMetadata}}}, Tuple{}, Array{Float64, 3}, DimensionalData.NoName, DimensionalData.Dimensions.Lookups.NoMetadata}` exists, but no method is defined for this combination of argument types when trying to treat it as a callable object.
Stacktrace:
 [1] top-level scope
   @ REPL[11]:1

The order of arguments would be the order of dims.

The DimArray construction would need to receive interpolation/extrapolation options to pass to the interpolator constructor.
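
For readers unfamiliar with the term, "functor" here means a callable object: Julia lets any type define a call method. The following is a minimal sketch using a hypothetical `Toy1D` type (not part of DimensionalData.jl) whose call method does an exact-match lookup, roughly what an `At` selector would do.

```julia
# Hypothetical stand-in for a 1-D dimensional array: lookup values plus data.
struct Toy1D{T<:Real}
    lookup::Vector{T}      # coordinate values along the single dimension
    data::Vector{Float64}  # one data value per lookup value
end

# Adding a call method makes instances callable: `s(x)` returns the value
# whose lookup entry equals `x`, and throws if there is no exact match.
function (s::Toy1D)(x::Real)
    i = findfirst(==(x), s.lookup)
    i === nothing && throw(ArgumentError("$x not found in lookup"))
    return s.data[i]
end

s = Toy1D([1, 2, 3], [10.0, 20.0, 30.0])
s(2)  # 20.0
```

Calling `s(1.5)` throws here, which is the gap that interpolation would fill.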

kapple19 Aug 15 '25 06:08

Yeah, that is interesting.

So the functor syntax is basically At for Points or Categorical lookups, and Contains for intervals?

If interpolation were specified for a lookup, then it would be e.g. an Interp selector? It would be good to have the getindex selector-based method available as well, as it lets you mix and match with other indexing styles.

And I guess selectors could still be used within the functor syntax, if e.g. Near was preferred to At or interpolation.
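
For reference, this is roughly what the existing selector-based `getindex` looks like, which the functor syntax would mirror. It is a small sketch assuming DimensionalData.jl is installed and uses only the `At` and `Near` selectors.

```julia
using DimensionalData

A = rand(X(1:3), Y(1:4))

# `At` requires an exact match in the lookup; `Near` picks the closest value.
a = A[X(At(2)), Y(At(3))]
b = A[X(At(2)), Y(Near(2.4))]

# Selectors can be mixed with plain indices on other dimensions.
c = A[X(At(2)), Y(2:3)]
```

Under the proposal, `A(2, 3)` would be shorthand for the first line, while anything beyond the default selector would still be written explicitly.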

rafaqz Aug 15 '25 07:08

Okay so the more creative functionality of functor signatures was also on my mind, and I see that you're also thinking along the same lines! 😁

That's right, so A(x, y, z) for example would use At on all dimensions, and that could be the default. If one or more values are not in the lookup, it can default to Near if the interpolation extension is not loaded, or an Interp selector if it is loaded. Something like that?

kapple19 Aug 15 '25 10:08

Great. But it's best to have no changes based on extensions loading: that can be triggered by another package unintentionally, and answers changing for the same code after running `using Package` are pretty hard to debug. Near should always be specified manually too. Basically, as a rule, all of the magic in this package has to at least be predictable to users based on the line of code they run.

For intervals we really want Contains because At isn't so meaningful.
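
One way to picture that rule is to pick the default selector by dispatch on the lookup's own sampling type, so the choice never depends on which packages are loaded. This is a sketch only: `PointsLike`, `IntervalsLike`, `AtLike`, `ContainsLike`, `default_selector` and `wrap` are hypothetical names, not DimensionalData.jl API.

```julia
# Hypothetical stand-ins for lookup sampling traits and selectors.
abstract type Sampling end
struct PointsLike <: Sampling end
struct IntervalsLike <: Sampling end

struct AtLike{T}; val::T; end
struct ContainsLike{T}; val::T; end

# The default depends only on the lookup's sampling type, so the same line
# of code always means the same thing.
default_selector(::PointsLike, val) = AtLike(val)
default_selector(::IntervalsLike, val) = ContainsLike(val)

# Wrap bare values; leave explicitly chosen selectors untouched.
wrap(sampling::Sampling, val::Real) = default_selector(sampling, val)
wrap(::Sampling, sel) = sel
```

Anything other than the default (a `Near`, say) would have to be written by the user, which keeps the behavior readable from the call site.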

rafaqz Aug 16 '25 00:08

So if I understand correctly, we can design it such that if a user has both DimensionalData and DataInterpolations loaded (which triggers the package extension) then there will be extra methods provided for DimArray constructors that accept configuration options for interpolation & extrapolation. Is that right?

Something like

data = DimVector(
    [0.0, 10, -10],
    [0.0, 1, 2] |> X;
    name = :y,
    interpolation = LinearInterpolation,
    extrapolation = ExtrapolationType.Periodic
)

data(0.5) # 5.0
data(-1) # ...something
data(5)

and

data = DimArray(
    x * sin(y)
    for x in 0 : 0.1 : 1 |> X,
        y in -1 : 0.2 : 1 |> Y;
    name = :z,
    interpolation = LinearInterpolation,
    extrapolation = ExtrapolationType.Periodic
)

data(0.1, 0.2)
data(0.05, 0.04)
data(-10, -20)

Do we want functor syntax in the base package for non-interpolation methods? I'm not sure how they can coexist with the functor interpolation methods, since it needs to check if it's an item for At first.

I think it's best to have the package extension define the functor methods.

I haven't thought about intervals enough yet, but that can coexist by being defined in the base package on Intervals and then methods defined in the extension are for Reals (or Numbers?).

kapple19 Aug 24 '25 08:08

I would say no!

The thing with this package is half the users don't often use DimArray!

So adding new struct fields doesn't give us a lot. AbstractDimArray and the dims Tuple are the core concepts here. If we add something to dims it works everywhere.

Additionally, you aren't specifying which dimensions to interpolate. What happens when one dimension is ordered categorical, or time? The key requirement is arbitrarily mixing selector behavior.

So the first most obvious thing is to add an Interpolated(val) selector. It can take the interpolation and extrapolation options as arguments or keywords.

But I'm guessing you also want some precalculation/setup to happen, and simpler syntax. So that could be added to Sampled lookups as a new field, so doing Interpolated(val) will just work (with the predefined interpolation/extrapolation for that specific dimension).
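
A rough sketch of what such a selector could carry, in plain Julia with no dependencies. The `Interpolated` struct here, its `extrapolation` field, and the `select` helper are all assumptions for illustration, and linear interpolation stands in for whatever an interpolation package would provide.

```julia
# Hypothetical selector: carries the query value and an extrapolation option.
struct Interpolated{T<:Real}
    val::T
    extrapolation::Symbol   # :error or :flat, illustrative only
end
Interpolated(val) = Interpolated(val, :error)

# Linear interpolation of `data` sampled at sorted `lookup` values.
function select(lookup::AbstractVector, data::AbstractVector, sel::Interpolated)
    x = sel.val
    if x < first(lookup) || x > last(lookup)
        sel.extrapolation === :flat ||
            throw(DomainError(x, "outside lookup range"))
        return x < first(lookup) ? float(first(data)) : float(last(data))
    end
    hi = searchsortedfirst(lookup, x)        # first index with lookup[hi] >= x
    lookup[hi] == x && return float(data[hi])
    lo = hi - 1
    t = (x - lookup[lo]) / (lookup[hi] - lookup[lo])
    return (1 - t) * data[lo] + t * data[hi]
end

select([0.0, 1.0, 2.0], [0.0, 10.0, -10.0], Interpolated(0.5))  # 5.0
```

Because the options live on the selector (or on the lookup), each dimension can be interpolated or not, independently of the others.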

rafaqz Aug 24 '25 10:08

For me the function syntax and interpolation are mostly orthogonal concepts.

They should work together, but in the same way everything else does.

Probably the key point of integration would be that if a Sampled lookup has a non-nothing interpolation field it would default to using interpolation instead of At or Contains.

rafaqz Aug 24 '25 10:08

I don't totally know if this will fit with how DataInterpolations.jl works. That's why it hasn't just been done already.

But applying interpolation to only some dimensions is going to be important for this not to be a limited "interpolate everything" use case, which isn't really how the package works normally.

I think it means we need to return some kind of interpolator indexing object from dims2indices that will be applied to the view left after indexing with the other selectors.

rafaqz Aug 24 '25 10:08

Looking at the other ask about wrapping dimensions in coords... it could make sense to have an Interpolation wrapper lookup that can wrap a dimension and work in a multidimensional way.

asinghvi17 Aug 24 '25 11:08

I think we can do the multidimensional part in internals of AbstractDimArray indexing. Like after return from dims2indices, when every other index is basically Int/Array/Colon, we will have these Interpolator objects still in the mix.

We can dispatch on that and organise the multidimensional interpolation from there by taking a view with all other indices and colons for our interpolators, then interpolating into the result.

(Or do you mean at the point of defining them? Is there anything that can't happen separately per dimension?)
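
The two-phase idea above can be sketched on a plain `Matrix`. Everything here is hypothetical (`Interp`, `interp1`, `index_then_interpolate`); it assumes the lookup equals the array axes and handles only one interpolated dimension, just to show the "view first, interpolate second" shape.

```julia
# Hypothetical marker left in the index tuple for dimensions to interpolate.
struct Interp
    val::Float64
end

# Linear interpolation along a vector whose lookup is simply 1:length(v).
function interp1(v::AbstractVector, x::Real)
    lo = clamp(floor(Int, x), 1, length(v) - 1)
    t = x - lo
    return (1 - t) * v[lo] + t * v[lo + 1]
end

# Phase 1: replace each Interp with a Colon and take a view using the
# already-resolved Int/range/Colon indices.
# Phase 2: interpolate along the dimension kept for the interpolator.
function index_then_interpolate(A::AbstractArray, inds...)
    plain = map(i -> i isa Interp ? Colon() : i, inds)
    v = view(A, plain...)
    interps = filter(i -> i isa Interp, inds)
    length(interps) == 1 && ndims(v) == 1 ||
        throw(ArgumentError("sketch supports exactly one interpolated dimension"))
    return interp1(v, only(interps).val)
end

A = [10.0 20.0 30.0;
     40.0 50.0 60.0]
index_then_interpolate(A, 2, Interp(1.5))  # 45.0, halfway between A[2,1] and A[2,2]
```

A real implementation would need to handle several interpolated dimensions at once and real lookup values, which is where the per-dimension versus multidimensional question comes in.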

rafaqz Aug 24 '25 12:08