Accessors.jl icon indicating copy to clipboard operation
Accessors.jl copied to clipboard

get/set queries

Open rafaqz opened this issue 3 years ago • 55 comments

This PR adds get/set queries as discussed in https://github.com/JuliaObjects/Accessors.jl/issues/9, returning a Tuple of values from an object, or setting object values from a Tuple

Currently you can select objects by a select function, and choose where to descend with a descend function.

For macros I was thinking something like this could be nice:

@getall x isa Int
@setall x isa AbstractArray = arrays

TODO:

  • [x] organise to use set/modify as other optics/lenses do
  • [x] compose with other optics/lenses
  • [x] write macros
  • [x] context queries, where the returned value can be some function of obj/fieldname/value
  • [ ] tree context queries, where you can build a tree structure of e.g. fieldnames that lead to the queries objects
  • [x] include Properties/Elements distinction, and maybe add Fields? this is using fieldnames in a @generated function

Notes:

  • After rewriting to match the style here ~~it seems difficult to match the performance of a standalone recursive generated function. Might get there with some work~~. The performance is fine. There is one type stabiulity issue left checking values have been changed to avoid using constructorof - typeof val/state pairs are needed to track if a change occurred. Otherewise its a lot faster now when the types are unstable.
  • Performance with fieldnames in @generated with the Fields optic I added is 10-100x better than Properties. But it may need some additions to ConstructionBase.jl to be legitimate. Flatten.jl has always done this, but it predates ConstructionBase.jl, and it's a bit dodgy. We could add fieldconstructor and propertyconstructor that both default to constructorof, in case they need to be different.

rafaqz avatar Jun 06 '21 06:06 rafaqz

Thanks! It was awful before I rewrote it with the structure here.

I'm not sure how to do composition properly yet, it needs more thinking. That's the part I understand the least about Accessors.jl so far... but getting there.

rafaqz avatar Jun 07 '21 08:06 rafaqz

The iterator is broken too, I knew it couldn't be that easy lol...

Fixed: but things are getting a bit less elegant - the iterator has to be passed around everywhere. Probably the end result of this is that things will mostly look clear like you say, but mapfields will be hell, with lots of tuple return value manipulation and special casing.

rafaqz avatar Jun 07 '21 08:06 rafaqz

This is composing now. It hast to compose internally, as you want to compose with whatever fields are selected by Query, not with its return value.

rafaqz avatar Jun 07 '21 15:06 rafaqz

This is composing now. It hast to compose internally, as you want to compose with whatever fields are selected by Query, not with its return value.

Can you give a few examples of what Query should do and how you think it should compose? I toyed a bit with this branch and found that things often do not behave as I anticipated. Now I realize that this is very much WIP so I am not sure, whether I don't understand the intended semantics of Query or whether these are just bugs / not really implemented.

For instance

julia> using Accessors

julia> obj = (a=missing, b=1, c=(d=missing, e=(f=missing, g=2)))
(a = missing, b = 1, c = (d = missing, e = (f = missing, g = 2)))

julia> Query(ismissing)(obj)
(missing, 1, missing, missing, 2)

I would have expected (missing,missing,missing) instead.

Also are querries supposed to be lenses? That is should q::Query satisfy the lens laws:

@assert q(set(obj, q, val)) ≅ val
        # You get what you set.
@assert set(obj, q, q(obj)) ≅ obj
        # Setting what was already there changes nothing.
@assert set(set(obj, q, val1), q, val2) ≅ set(obj, q, val2)
        # The last set wins.

jw3126 avatar Jun 07 '21 16:06 jw3126

Yeah that's just a bug, the answer should be what you expect. I need to write a few more tests.

It should follow the lens laws... notice modify just uses map over the results of get to feed into set: https://github.com/rafaqz/Accessors.jl/blob/e6e851b67318daac07647adcdab8714ad47627be/src/optics.jl#L530

The few tests I had just randomly work as expected :laughing:

rafaqz avatar Jun 07 '21 23:06 rafaqz

Ok this version should be working for lens laws, but with some problems with set for Properties and bad set performance.

rafaqz avatar Jun 08 '21 00:06 rafaqz

I just realized, that querries are not lenses in general.

@assert q(set(obj, q, val)) ≅ val
        # You get what you set.
@assert set(obj, q, q(obj)) ≅ obj
        # Setting what was already there changes nothing.
@assert set(set(obj, q, val1), q, val2) ≅ set(obj, q, val2)
        # The last set wins.

I believe the first and second lens law should always hold, but the third may not:

julia> using Accessors

julia> obj = (a=missing, b=1, c=(d=missing, e=(f=missing, g=2)))
(a = missing, b = 1, c = (d = missing, e = (f = missing, g = 2)))

julia> q = Query(ismissing);

julia> obj2 = set(obj, q, ["first", "wins", "here"])

julia> set(obj2, q, ["second", "should", "win"])

Not a huge deal, but we should be careful about using lens as a variable name.

jw3126 avatar Jun 08 '21 08:06 jw3126

Yeah... the query would have to be updated for that to hold, e.g:

julia> set(obj2, Query(x -> x isa String), ["second", "should", "win"])
(a = "second", b = 1, c = (d = "should", e = (f = "win", g = 2)))

So it breaks 3. when the the values passed to set would give a different query result to the values in the object that they replace. We just need to give that a name I guess.

Formally (for tuples at least) maybe its:

islens = query(values) == values

Also let me know what you think of the macros (theyre in the tests now too). They seem nice for simple use cases, just using the Query object is probably easier beyond something like this:

julia> missings_obj = (a=missing, b=1, c=(d=missing, e=(f=missing, g=2)))
(a = missing, b = 1, c = (d = missing, e = (f = missing, g = 2)))

julia> @getall missings_obj isa Number
(1, 2)

julia> @setall ismissing(missings_obj) = (1.0, 2.0, 3.0)
(a = 1.0, b = 1, c = (d = 2.0, e = (f = 3.0, g = 2)))

Edit: it feels like these examples should also work for an iterator like a single number or Ref, although that breaks lens laws as well.

rafaqz avatar Jun 08 '21 08:06 rafaqz

How do you feel about a dependency on Static.jl?

Using StaticInt as an iterator solves the type stability with set.

rafaqz avatar Jun 08 '21 10:06 rafaqz

How do you feel about a dependency on Static.jl?

Using StaticInt as an iterator solves the type stability with set.

Static.jl looks like a small dependency, so I am fine with it.

jw3126 avatar Jun 08 '21 13:06 jw3126

Turns out Static.jl doesn't actually fix it, I'm just getting Revise.jl artifacts from this Julia bug: https://github.com/JuliaLang/julia/issues/35800

We'll have to settle for 20-100ns or so timings until that gets fixed. I can get everything under 2ns using Revise but it wont stick on restart, which has been driving me crazy.

rafaqz avatar Jun 08 '21 13:06 rafaqz

I think I have an alternative approach how this can be implemented. Let me first sketch a slow implementation, analyze why it is slow and sketch how to fix it.

struct Query{S,D,O}
    select::S
    descend_condition::D
    optic::O
end

function modify(f, obj, q)
    modify(obj, q.optic) do o
        if q.select_condition(o)
            f(o)
        elseif q.descent_condition(o)
            modify(f, o, q)
        else
            o
        end
    end
end

pop(x) = first(x), Base.tail(x)
push(x, val) = (x..., val)

function setall(obj, q, vals)
    modify(obj, q) do o
        val, vals=pop(vals)
        val
    end
end

function getall(obj, q)
    ret = ()
    modify(obj,q) do o
        push(ret, o)
        o
    end
    return ret
end

Now this is slow, because we change captured variables in a closure. How to fix this? By passing state around explicitly. So we have to replace modify as a primitive by a state aware modify_stateful.

function getall(obj, q)
    initial_state = ()
    _, final_state = modify_stateful((obj, initial_state), q) do o, state
        new_state = push(state, o)
        o, new_state
    end
    return final_state
end

"""

    new_obj, new_state = modify_stateful(f, (obj,state), optic)

Here `f` has signature `f(::Value, ::State) -> Tuple{NewValue, NewState}`.
"""
function modify_stateful end

function modify_stateful(f, (obj, state1), optic::Properties)
    # @generated function that produces the following code:
    nt = getproperties(obj)
    val1, state2 = modify_stateful(f, (nt.field1, state1), optic)
    val2, state3 = modify_stateful(f, (nt.field2, state2), optic)
    val3, state4 = modify_stateful(f, (nt.field3, state3), optic)
    val4, state5 = modify_stateful(f, (nt.field4, state4), optic)
    patch = (field1=val1, field2=val2...)
    new_obj = setproperties(obj, patch)
    new_state = state5
    return (new_obj, new_state)
end

function modify_stateful(f, (obj, state), q::Query)
    modify_stateful((obj,state), q.optic) do o,s
        if q.select_condition(o)
            f(o,s)
        elseif q.descent_condition(o)
            modify_stateful(f, (o,s), q)
        else
            o, s
        end
    end
end

function modify_stateful(f, (obj, state), ::ComposedOptic)
    ...
end

# all currently existing modify methods in Accessors.jl can be rewritten into `modify_stateful` in a straight forward fashion.
# Like with modify we want to use trait based dispatch `_modify_stateful(f, obj_state, optic, ::OpticStyle)`.
# `modify` may in the end be reimplemented in terms of `modify_stateful` if the compiler likes it.

What do you think?

jw3126 avatar Jun 09 '21 04:06 jw3126

Wow that looks like a flexible generalisation, lets use it. Thanks for putting the time in to rethink this.

rafaqz avatar Jun 09 '21 04:06 rafaqz

Also let me know what you think of the macros (theyre in the tests now too). They seem nice for simple use cases, just using the Query object is probably easier beyond something like this:

I wonder if we can capture generator syntax here to make the macros a bit more flexible.

@getall (x for x in obj if select(x))
@getall Float64[x for x in obj if select(x)] # collects into a vector of Float64 
@modify CuArray(x) for x in obj if x isa AbstractArray # move everything to GPU
@getall (x for x in obj |> Elements()) # customize the optic to flatten a nested tuple

jw3126 avatar Jun 09 '21 05:06 jw3126

This is nice syntax:

@getall Float64[x for x in obj if select(x)] # collects into a vector of Float64 

You could also use lenses:

@getall [x.a for x in obj if x isa NamedTuple]
@setall (x[2] for x in obj if x isa AbstractArray) = (1, 2 ,3)

rafaqz avatar Jun 09 '21 08:06 rafaqz

Not for this PR, but for a later stage I think we can connect this with Transducers. The idea is to turn (obj, optic) into a ReducibleCollection. Roughly speaking by:

function Transducers.foldl(rf, ReducibleCollection(obj, optic), init)
    # TODO this is the wrong method to overload, but you get the idea
    _, ret = modify_stateful((obj, init), optic) do o, state
        new_state = rf(state, o)
        o, new_state
    end
    return ret
end

We would have:

getall(obj, q) = foldl(push!!, ReducibleCollection(obj, q))

Benefits would be that we could directly apply reduction functions to queries and Transducers could take care of the output container (e.g. Tuple, Vector, lazy iterator...).

jw3126 avatar Jun 09 '21 08:06 jw3126

Transducers interop sounds good for a future PR. I probably have to leave that to you as I haven't had time to get my head around transducers yet.

rafaqz avatar Jun 09 '21 08:06 rafaqz

You could also use lenses:

@getall [x.a for x in obj if x isa NamedTuple]
@setall (x[2] for x in obj if x isa AbstractArray) = (1, 2 ,3)

Ah nice, this is a way to deal with simple "contextful" queries.

jw3126 avatar Jun 09 '21 09:06 jw3126

The context that is harder to get is the parent object and field name (probably as a StaticSymbol/StaticInt). Fieldname is really useful for making name => value pairs e.g widget labels or checking you are querying the right fields. We can work it into state somehow.

This was fairly straightforward but not finished:

julia> missings_obj = (a=missing, b=1, c=(d=missing, e=(f=missing, g=2)))
(a = missing, b = 1, c = (d = missing, e = (f = missing, g = 2)))

julia> @getall (x.e for x in missings_obj if x isa NamedTuple)
((f = missing, g = 2),)

rafaqz avatar Jun 09 '21 12:06 rafaqz

The context that is harder to get is the parent object and field name (probably as a StaticSymbol/StaticInt). Fieldname is really useful for making name => value pairs e.g widget labels or checking you are querying the right fields. We can work it into state somehow.

I totally agree fieldnames are so useful. More general context is tough and IMO out of scope for this PR. But as you say the way to go piggy bag context into state. Probably via some PushCtx() mechanism like what I tried to sketch here.

jw3126 avatar Jun 09 '21 15:06 jw3126

I've implemented modify_stateful and built getall, setall and context methods on it generally following the ideas from this comment. The structure seems pretty clean, although the generated function is still a bit of a mess. State wrappers like ConstextState, GetAllState and SetAllState allow dispatching behaviours for each type and also wrapping state objects.

Unfortunately it's harder to optimise now, I've got some lingering type instabilities. Maybe you can see whats causing them.

rafaqz avatar Jun 19 '21 14:06 rafaqz

@jw3126 sorry I haven't had time for this in a while, just revisiting now.

How do you feel about these options:

  1. finish this with the current type instability
  2. reduce the scope to only matching types (so all at compile time) and separate out more generated functions to get type stability

rafaqz avatar Jul 19 '21 07:07 rafaqz

@jw3126 sorry I haven't had time for this in a while, just revisiting now.

How do you feel about these options:

1. finish this with the current type instability

2. reduce the scope to only matching types (so all at compile time) and separate out more generated functions to get type stability

How urgent is this for you? One option would be to just wait until either inference can handle this out of the box or compiler plugins are stable and powerful enough that we can help inference. If you don't want to wait that long, I am fine with 1. About 2. my concern is that implementation will be complicated and composability will be a problem.

jw3126 avatar Jul 19 '21 09:07 jw3126

Ok maybe we can do 1 and fix the performance things when we can.

The (low key) urgency is that I'm keen to move on from using Flatten.jl in a few packages. The lack of checks on state change put a ceiling on its use when there are fields in an object that don't have constructorof defined, and if I'm adding functionality like that I'd rather it was here than in Flatten.jl.

rafaqz avatar Jul 19 '21 10:07 rafaqz

Ok maybe we can do 1 and fix the performance things when we can.

Sounds good!

jw3126 avatar Jul 19 '21 10:07 jw3126

Ok this is pretty much good to go. The last question is what to do with the context method and how to write @modify in relation to the generator syntax of the @setall and @getall macros.

rafaqz avatar Aug 02 '21 12:08 rafaqz

Any updates on this?

bgroenks96 avatar Feb 11 '22 13:02 bgroenks96

I fear with the current compiler these type instabilities cannot be fixed. My feeling is the only way would be compiler plugins, but there is no mature way to do that. The other option is to merge a type unstable variant. Would that be useful to people?

jw3126 avatar Feb 11 '22 13:02 jw3126

Is it type unstable in all cases or just in specific cases?

bgroenks96 avatar Feb 11 '22 13:02 bgroenks96

IIRC queries of moderate complexity are not type stable, trivial queries work?

jw3126 avatar Feb 11 '22 14:02 jw3126