Functors.jl Use the cache less often

One of the things #32 proposes is to disable the use of the cache for some types, so that e.g. the number 4 appearing at two nodes is not regarded as a form of parameter sharing, just a co-incidence. This PR wants to make that change alone.

But what exactly is the right rule here? If (x = [1,2,3], y = 4) appears twice, then the shared [1,2,3] should be cached, but I think the 4 should still not be.

m1 = (a = (x = [1,2,3], y = 4), b = (x = [1,2,3], y = 4))
m1.a.x !== m1.b.x  # distinct arrays, hence not tied
m1.a.y === m1.b.y  # these are isbits, should not be tied

shared = [1,2,3]
m2 = (a = (x = shared, y = 4), b = (x = shared, y = 4))
m2.a.x === m2.b.x  # these really are tied
m2.a === m2.b  # but these should not be, since they would tie 4 === 4

m3 = (a = [[1,2,3], 4], b = [[1,2,3], 4])
m4 = (a = [shared, 4], b = [shared, 4])  # should still not tie 4 === 4
m4.a !== m4.b  # but now the container won't tempt you

This PR thinks that the right test is to only use cache on leaf nodes. #32 tested instead !isbits(x) && ismutable(x), which will also work on these examples. Where they differ is on an immutable container enclosing a mutable array:

shared = NamedDimsArray([1,2,3], :tri)  # a Functors-unaware array wrapper struct
shared = TiedArray(SA[1,2,3], Ref(0))   # the idea we had for marking StaticArrays as tied

Right now this uses the exclude keyword not the fixed isleaf. I think that makes sense but haven't thought too hard.

Feb 12 '22 16:02 mcabbott

I thought about only considering leaves for caching, but one hiccup is when a mutable struct is used as a shared non-leaf node. Think Flux.BatchNorm. We could say that non-leaf nodes will always be untied, but that needs to be a) decided and b) documented with a prominent warning banner.

Feb 12 '22 17:02 ToucheSir

It seems that by making your type non-leaf, you have opted in to having Functors traverse into it. What's gained by not doing so? Can you clarify what problem traversing it will cause?

My NamedDimsArray example is my complaint about ismutable, distilled. Could be OffsetArray, too. This ought to work just like an Array.

Since TiedArray is more explicitly about Functors, it could overload things.

Feb 12 '22 17:02 mcabbott

Traversing it won't cause any issues, the question is how it should be re-assembled post traversal. Specifically if you run into the same mutable non-leaf multiple times, whether it should be untied as part of the reassembly process (vs fetched from the cache on subsequent occurrences).

The question is whether we're ok with breaking the following behaviour: given x′ = fmap(identity, x), any nodes in x that are === should also be so in x′. Maybe this is a reasonable thing to not care about, but if so it ought to be documented.

My NamedDimsArray example is my complaint about ismutable, distilled. Could be OffsetArray, too. This ought to work just like an Array.

My thought for this was to lean on parent for array wrappers. viz.

function usecache(x::AbstractArray)
  p = parent(x)
  p !== x && return usecache(p) # alt. typeof(p) !== typeof(x), etc. if we wanted type stability or to avoid potentially expensive `===` methods.
  return ismutable(x) # fallback 
end

usecache(x::Array) = true
...

Feb 12 '22 18:02 ToucheSir

I guess " any nodes in x that are === should also be so in x′." is a simple-to-explain policy. Would be nice if the policy for how often f will be called was as simple. Will think a bit.

It seems a bit fragile to depend on the right parent methods existing (in one case) and not existing (in the other), since nothing requires them.

Feb 13 '22 01:02 mcabbott

What wrapper types don't expose a parent method? Worst case they are considered not cacheable due to the fallback in https://github.com/JuliaLang/julia/blob/master/base/abstractarray.jl#L1398.

Would be nice if the policy for how often f will be called was as simple. Will think a bit.

Aye, this is much easier in a purer functional language where you're only traversing over trees. Sometimes I'm tempted to try representing the object graph as an actual digraph for this reason, but that's a little far off the deep end :P

Feb 13 '22 01:02 ToucheSir

Now updated with a different rule:

non-leaf objects are cached only if they are mutable (as immutable structs can be perfectly reconstructed).
leaf objects are cached if they contain any mutable objects (by recursing fieldnames(x) etc, nothing special to arrays).

Weird cases:

If you use exclude to have f act on some non-leaf types, which are immutable but contain mutable objects, then it will not cache the result. Should it?
If an immutable struct has fields which are not children, it still won't be cached. However, the reconstruction will re-use the same original non-child, that's OK.

Sep 24 '22 14:09 mcabbott

Am going to merge this so that master has the new behaviour, but won't rush to tag it.

The code here will be changed by #43, but the tests may (I think) survive.

Oct 19 '22 23:10 mcabbott

Functors.jl Functors.jl copied to clipboard

Use the cache less often

Functors.jl
Functors.jl copied to clipboard