ComponentArrays.jl icon indicating copy to clipboard operation
ComponentArrays.jl copied to clipboard

Symbolic indexing should slice, not view

Open jonniedie opened this issue 3 years ago • 16 comments

getproperty should continue to be a maybeview situation, but getindex should always create a copy, even when one or more of the indices are symbols. The way it is right now, getindex just follows getproperty when there are symbol indices and regular slicing when there are none. This is kinda weird and surprising. Views should be explicitly made with @view, view, or Base.maybeview.

jonniedie avatar May 26 '21 04:05 jonniedie

Hey! This broke my code! :(

bgroenks96 avatar Jun 01 '21 13:06 bgroenks96

I'm not sure I agree that it's surprising... but I can see how maybe this is more consistent with Julia array behavior.

That being said, I would still argue that getindex with a symbol should behave exactly like getproperty. This is what I expected and how I wrote my code.

I think it's a matter of philosophy whether the ComponentArray should behave like a data structure or purely as an array.

bgroenks96 avatar Jun 01 '21 13:06 bgroenks96

Sorry this broke your code! I've been trying to keep breaking changes to a minimum, but the last few changes that involved indexing/viewing have been something that's needed to happen for a while for a few reasons.

First is consistency with the array interface. There is a lot of room to play in whether ComponentArrays should be more struct-like or array-like. While ComponentArrays are supposed to act more like structs for the user, it's important that they follow the array interface in order to work inside of solvers. For a while, ComponentArrays always viewed, even when using numerical indices. This ended up causing some bugs because solvers (and things that use arrays in general) expect getindex on <:AbstractArrays to always create copies. So I changed it such that purely numerical/colon indexing would slice, but symbolic indexing would view. But then what happens when you have a higher-dimensional ComponentArray like a ComponentMatrix and one of the indices is a number or colon and the other is a symbol? Should it slice or view? It is much more consistent to just say getindex always slices and view always views.

Which brings up the second issue: How do you slice with symbolic indices if that's what you want? With the old behavior, there was no way to slice with symbolic indices. You had to view and then create a copy of the view. With the new behavior, slicing and viewing is done explicitly and works no matter what type of indices you have.

Again, I apologize for this breaking your code. It's not a decision I made lightly (and it actually broke some of my code as well, because I was depending on that behavior in a few cases). Thankfully, mine only took a few @views sprinkled here and there. Hopefully yours will be similar.

jonniedie avatar Jun 01 '21 14:06 jonniedie

I do symbolic indexing programmatically, i.e. I loop over variable names and then do getindex to write to the sub-array, e.g:

    # set up default mass matrix
    M_diag = similar(setup.uproto)
    for layer in keys(setup.meta)
        # setup.meta is a named tuple
        progvars = setup.meta[layer][:progvars]
        algvars = setup.meta[layer][:algvars]
        for var in progvars
            M_diag[layer][varname(var)] .= one(eltype(M_diag))
        end
        for var in algvars
            M_diag[layer][varname(var)] .= zero(eltype(M_diag))
        end
    end

By hand via getproperty this would be something like M_diag.soil.T .= 1.0, as an example. So the issue here for me is that I use getindex as the programmatic equivalent of getproperty... though I just realized that I could presumably also invoke getproperty, though that's a bit ugly.

bgroenks96 avatar Jun 01 '21 15:06 bgroenks96

Try:

    # set up default mass matrix
    M_diag = similar(setup.uproto)
    for layer in keys(setup.meta)
        # setup.meta is a named tuple
        progvars = setup.meta[layer][:progvars]
        algvars = setup.meta[layer][:algvars]
        diag_layer = @view M_diag[layer]
        for var in progvars
            diag_layer[varname(var)] .= one(eltype(M_diag))
        end
        for var in algvars
            diag_layer[varname(var)] .= zero(eltype(M_diag))
        end
    end

jonniedie avatar Jun 01 '21 15:06 jonniedie

diag_layer would also be a CompnentArray so I would have to make that a view too, I guess.

bgroenks96 avatar Jun 01 '21 15:06 bgroenks96

You don't have to view that because setindex! always sets into the array instead of a copy of the array

jonniedie avatar Jun 01 '21 15:06 jonniedie

Oh, btw, there's a function called valkeys that's exported that will get Val-wrapped keys from your ComponentArray so that indexing is much faster than keys that you get by using keys.

jonniedie avatar Jun 01 '21 15:06 jonniedie

I feel like I tested this before and observed no difference in performance? But maybe I tested getproperty and not getindex... I don't remember.

bgroenks96 avatar Jun 01 '21 15:06 bgroenks96

Anyway, this particular code snippet is not performance critical so no worries about that :)

bgroenks96 avatar Jun 01 '21 15:06 bgroenks96

Yeah, if you tested with getproperty, there would be no difference in performance because getproperty is able to constant propagate but getindex for some reason isn't (even though the code is the exact same for both on my end). It's a long-standing issue for ComponentArrays.

jonniedie avatar Jun 01 '21 15:06 jonniedie

Probably Julia has special handling for getproperty. Have you tried benchmarking with getproperty invoked directly, maybe also with a non-constant argument?

bgroenks96 avatar Jun 01 '21 15:06 bgroenks96

By the way, another side note, I actually don't know if it's really so critical that ComponentArray works inside of a solver. I currently just store the axis information and then invoke the direct constructor ComponentArray(array, axis) on each step, which seems to be more or less cost-free (though please correct me if I am wrong). This lets you use regular array types in the solver but ComponentArrays in your code which feels like a best of both worlds to me.

bgroenks96 avatar Jun 01 '21 15:06 bgroenks96

This is a good method to have around as a backup, but I think it would be unfortunate if that were the only way you could use ComponentArrays in a solver. There's a fair amount of boilerplate doing it that way and I think it just makes things less nice to work with in general. Plus non-solver things like state-space systems with named states (see #88) also require ComponentArrays to keep their Component-ness through array-like operations.

jonniedie avatar Jun 01 '21 18:06 jonniedie

This came up again on Slack today, so I'm going to go ahead and re-open this up to get some more opinions on it. I want to make sure this is the best indexing behavior before cementing it in place for v1.0 (which should come within the next few months?).

jonniedie avatar Jun 02 '21 23:06 jonniedie

After thinking about it more, I think I agree with this change. Even for struct-like behavior, getindex is not defined by default, and when it is, it typically signals array-like behavior. Since the default behavior of getindex is usually to slice (as much as I vehemently disagree with this decision by the Julia devs), it makes sense that getindex with a symbol should follow this convention.

It was annoying that my code broke, but I can't think of any good argument against this change.

bgroenks96 avatar Jun 08 '21 14:06 bgroenks96