Accessors.jl icon indicating copy to clipboard operation
Accessors.jl copied to clipboard

Traversal order of `Recursive`

Open jariji opened this issue 1 year ago • 11 comments

Are there any guarantees on the traversal order of Recursive?

jariji avatar Mar 14 '24 23:03 jariji

I think the only achievable order is determined by whatever the passed optic argument does. Not sure if anything else is realistically possible given the modify interface... Maybe explicitly stating it in the docs would be nice. Do you have an example where there are several reasonably possible orders?

Btw, using modify with order-dependent f isn't exactly best practice.

aplavin avatar Mar 15 '24 00:03 aplavin

I think the only achievable order is determined by whatever the passed optic argument does.

That sounds good. Do Elements() and Properties() have a defined order?

Do you have an example where there are several reasonably possible orders?

I feel this interface is related to Transducers which makes me think it's going to eventually become parallel so order is going to become indeterminate for code like this or similar.

julia> let i = 0
           modify((;a=10, b=20), @o _ |> Elements()) do x
               i += 1
           end
       end
(a = 1, b = 2)

jariji avatar Mar 15 '24 00:03 jariji

It is not documented, but I agree with what @aplavin said:

  • I consider the order or even the number of calls to f an implementation detail. Would be good to document though.
  • If we ever want to parallelize, that would be opt in, like it is in Transducers. At least in my use cases Accessors usually does such tiny computations that splitting them between threads etc would be slower. And if parallel is opt in, we can impose whatever extra requirements we want about f.

jw3126 avatar Mar 15 '24 07:03 jw3126

@jariji if you have some specific usecases in mind, that would benefit from well-defined traversal order or from parallelization, I'd be curious to hear about them :)

Also, for more traversal options, see RecursiveOfType from AccessorsExtra:

julia> obj = (a=1, b=(c=2, d=3))

# don't step into matching elements:
julia> modify(Dict∘pairs, obj, RecursiveOfType(NamedTuple))
Dict{Symbol, Any} with 2 entries:
  :a => 1
  :b => (c = 2, d = 3)

# step into matches as well:
julia> modify(Dict∘pairs, obj, RecursiveOfType(NamedTuple, order=:pre))  # here, :post would be the same
Dict{Symbol, Any} with 2 entries:
  :a => 1
  :b => Dict(:d=>3, :c=>2)

aplavin avatar Mar 30 '24 07:03 aplavin

if you have some specific usecases in mind, that would benefit from well-defined traversal order

There are plenty of other ways to do this that don't rely on order, but here's a use case that came to mind. Optim.optimize wants an array, so one way to turn a struct into an array and back is

  • create an empty array a
  • traverse through the struct of initial values, assigning a[i] = x and modifying each value by replacing the struct value with i += 1
  • optimize with a
  • traverse over the struct again, modifying each entry by replacing it with a[i] to put the minimizer for the ithe parameter back into its right place in the struct.

That relies on the traversal going the same way both times.

if you have some specific usecases in mind, that would benefit from ... parallelization

All the parallel monoid operations we often do on vectors are quite tree-shaped. Eg summing up the nodes in a tree, searching an XML/HTML/JSON tree for a pattern.

jariji avatar Mar 30 '24 08:03 jariji

The first (optimization) case is very convenient with getall + setall. Also, see https://github.com/JuliaAPlavin/AccessibleOptimization.jl – a thin wrapper around Optimization.jl that allows specifying parameters as optics. Uses getall + setall underneath, of course.

traversal going the same way both times

Oh, this is a weaker and totally reasonable guarantee to have (in sequential code) – to avoid any potential confusion. That is, "modify() can traverse in any order, but the order remains the same for multiple calls".

Eg summing up the nodes in a tree

Does summing (or another simple operation) really benefits from multithreading? I thought they are really memory-access-limited.

aplavin avatar Mar 30 '24 18:03 aplavin

The first (optimization) case is very convenient with getall + setall.

Alright. getall/setall might want to document that they traverse in the same order as each other.

Oh, this is a weaker and totally reasonable guarantee to have (in sequential code) – to avoid any potential confusion. That is, "modify() can traverse in any order, but the order remains the same for multiple calls".

Good point.

What happens if setall adds new nodes: which properties of the traversal order are preserved: order, distance, etc?

Does summing (or another simple operation) really benefits from multithreading? I thought they are really memory-access-limited.

Maybe. If the cpu is waiting for x.a to arrive it could simultaneously work on x.b.

jariji avatar Mar 30 '24 19:03 jariji

What happens if setall adds new nodes: which properties of the traversal order are preserved: order, distance, etc?

What are "new nodes"? I don't think setall is supposed to add anhything.

aplavin avatar Mar 30 '24 22:03 aplavin

If we have (10, 20) and then setall replaces each one with a new container so that we have (((10,), (100, )), ((20,), (200)), then there are more nodes than before.

Are we guaranteed breadth-first traversal?

jariji avatar Mar 30 '24 22:03 jariji

Wdym by breath-first here, could you give a more complete explanation/code example?

aplavin avatar Apr 01 '24 19:04 aplavin

I'm not sure what I meant....

jariji avatar Apr 01 '24 20:04 jariji