DimensionalData.jl `findmax` does not work on DimStacks

MWE:

using DimensionalData
A = [1.0 2.0 3.0;
     4.0 5.0 6.0]
x, y, z = X([:a, :b]), Y(10.0:10.0:30.0; metadata=Dict()), Z()
dimz = x, y
da1 = DimArray(A, (x, y); name=:one, metadata=Metadata())
da2 = DimArray(Float32.(2A), (x, y); name=:two)
da3 = DimArray(Int.(3A), (x, y); name=:three)
da4 = DimArray(cat(4A, 5A, 6A, 7A; dims=3), (x, y, z); name=:extradim)

s = DimStack((da1, da2, da3))

maximum(s) # fine
findmax(s) # errors

Error:


julia> maximum(s) # fine
(one = 6.0, two = 12.0f0, three = 18)

julia> findmax(s) # errors
ERROR: MethodError: no method matching isless(::DimMatrix{…}, ::DimMatrix{…})
The function `isless` exists, but no method is defined for this combination of argument types.

Closest candidates are:
  isless(::Missing, ::Any)
   @ Base missing.jl:87
  isless(::Any, ::Missing)
   @ Base missing.jl:88
  isless(::Base.CoreLogging.LogLevel, ::Base.CoreLogging.LogLevel)
   @ Base logging/logging.jl:131
  ...

Stacktrace:
  [1] _rf_findmax(::Tuple{DimMatrix{…}, Symbol}, ::Tuple{DimMatrix{…}, Symbol})
    @ Base ./reduce.jl:907
  [2] BottomRF
    @ ./reduce.jl:86 [inlined]
  [3] MappingRF (repeats 2 times)
    @ ./reduce.jl:100 [inlined]
  [4] _foldl_impl(op::Base.MappingRF{…}, init::Base._InitialValue, itr::Base.Iterators.Zip{…})
    @ Base ./reduce.jl:62
  [5] foldl_impl(op::Base.MappingRF{…}, nt::Base._InitialValue, itr::Base.Iterators.Zip{…})
    @ Base ./reduce.jl:48
  [6] mapfoldl_impl(f::Base.var"#353#354"{…}, op::typeof(Base._rf_findmax), nt::Base._InitialValue, itr::Base.Generator{…})
    @ Base ./reduce.jl:44
  [7] mapfoldl(f::Function, op::Function, itr::Base.Generator{…}; init::Base._InitialValue)
    @ Base ./reduce.jl:175
  [8] mapfoldl
    @ ./reduce.jl:175 [inlined]
  [9] _findmax
    @ ./reduce.jl:906 [inlined]
 [10] findmax(f::Function, domain::DimStack{…})
    @ Base ./reduce.jl:905
 [11] _findmax(a::DimStack{…}, ::Colon)
    @ Base ./reduce.jl:935
 [12] findmax(itr::DimStack{…})
    @ Base ./reduce.jl:934
 [13] top-level scope
    @ REPL[265]:1
Some type information was truncated. Use `show(err)` to see complete types.

If this is intended, we should dispatch on findmax and throw a better error message. If not - how would this work? Returning a namedtuple of selectors for each layer seems a bit arbitrary...

Nov 12 '24 15:11 asinghvi17

btw, I don't need this, just an edge case I stumbled into.

Nov 12 '24 15:11 asinghvi17

Oh no now reductions are kind of broken if you expect the new iteration :crying_cat_face:

The NamedTuple/Array crossover is endless pain, every change breaks some other expectation

Nov 12 '24 17:11 rafaqz

But doing this over a NameTuple iterator is totally useless, it will just ~~error~~ return a weird result:

julia> maximum(s) # fine
(one = 6.0, two = 12.0f0, three = 18)

Its also generally useless over multi-value objects like arrays, because you just end up with a single object that was chosen as the "maximum" for some arcane reason or other (maximum first value in this case), rather than the maximum of each separate position in the vector.

TLDR we can't have findmax find a single Int and have maximum work like this too, unless findmax returns a NamedTuple of indices in each separate layer. Or, it just looks at the first layer, breaking with how the other reducing methods work.

Nov 12 '24 17:11 rafaqz

So this is the problem:

julia> maximum([(a=1, b=200), (a=2, b=1)])
(a = 2, b = 1)

We may need a slightly more formal definition of what a DimStack is and how it operates under iteration and reduction. Its just hard to know what definition is the most useful practically until you try them out for a year

Nov 12 '24 17:11 rafaqz

This is what DimStack is currently:

`Array` of `NamedTuple`

iteration and indexing: one single Array of NamedTuple

`NamedTuple` of `Array`

reduction: multiple separate arrays of different lengths reduced separately
broadcast_dims: separate broadcasts over each layer like reductions
merge and Symbol indexing/getproperty: separate layers like a NamedTuple of Array

Both

tables: Row-wise A of NT, Col-wise NT of A (because looped to be the same length)

Undefined

broadcast: ?

So, still pretty weird. I find sum(dimstack) to be an odd outcome, and different to summing the iterator unless all layers are the same size.

findmax just doesn't really fit in at all.

I always wondered if it was trying to do too much in one object, and should just be two that you can switch between.

Nov 13 '24 00:11 rafaqz

`findmax` does not work on DimStacks

Array of NamedTuple

NamedTuple of Array

Both

Undefined

`Array` of `NamedTuple`

`NamedTuple` of `Array`