ChainRules.jl icon indicating copy to clipboard operation
ChainRules.jl copied to clipboard

Audit performance of captured variables in closures

Open ararslan opened this issue 6 years ago • 9 comments

One of the benefits of ChainRules' design is that rules for multiple arguments can share intermediate computations by virtue of defining variables outside of the individual Rules then capturing them in the wrapped closures. However, this approach likely incurs the infamous https://github.com/JuliaLang/julia/issues/15276. To get around this, we could potentially define a macro that does a Rule definition but scans the closure expression for uses of variables not defined therein, and wrapping those in let. That would be heinously hacky but might buy us some performance improvements.

ararslan avatar Jun 20 '19 18:06 ararslan

I have had a few times this kinda crazy idea of Rules not wrapping closures, but instead being there own singleton functors, that are a lot like closures.

I am pretty sure you can pull that off fairly easily if your willing to use a macro to declare them programatically. And it would remove a layer of pointer indirection.

oxinabox avatar Jun 20 '19 18:06 oxinabox

We could also use https://github.com/c42f/FastClosures.jl, which I believe does the hack I described above, but is a registered, maintained(?) package. It does not support do blocks though, which I have been using judiciously in the stuff I've added.

ararslan avatar Jun 20 '19 18:06 ararslan

but instead being there own singleton functors, that are a lot like closures.

That's interesting. So they would work kind of like how functions work in Julia, where they all subtype a common supertype (like Function) but each have their own concrete type? Then instead of defining closures inside of Rules, you'd define a constructor for the particular rule type? That's an interesting idea, but if my understanding is correct, you'd still be capturing outer variables in the constructor body. (Not sure whether that incurs the same penalty as the linked issue though.)

ararslan avatar Jun 20 '19 18:06 ararslan

I believe the problem is boxing.

julia> function foo()
       x=2
       g()=x+=1
       h()=x-=1
       g,h
       end
foo (generic function with 2 methods)

julia> g1,h1 = foo()
(getfield(Main, Symbol("#g#25"))(Core.Box(2)), getfield(Main, Symbol("#h#26"))(Cor
e.Box(2)))

julia> G = typeof(g1)
getfield(Main, Symbol("#g#25"))

julia> G.mutable
false

julia> g1()
3

julia> g1() |> typeof
Int64

julia> g1.x
Core.Box(4)

julia> (g1.x |> typeof).mutable
true

So when we make our Rule functor types, we just do not allow them to box (unlike closures which can). That will make certain things impossible (e.g. that foo thing), but will ensure that it is fast.

oxinabox avatar Jun 20 '19 18:06 oxinabox

To some approximation:

macro Rule(expr)
    # Add a rule cache check here maybe.

    closes_over =  vars_scan(expr)

    sig, body = sigbody_split_function(expr)
    neobody = replace(body, Dict(var=>Expr(:., rule, var) for var in closes_over))

    functorname $(gensym(:rule))
    typeparams = Symbol.(uppercase.(string.(vars))) 
    quote
        @eval(  # got to run at global scope
            struct $(functorname){$typeparams} <: AbstractRule 
                $(zip(Expr(:(::), closes_over, typeparams)...
            end

            function (rule::$functorname)($sig)
                $neobody
            end
        )

        functorname(closes_over)
    end
end

oxinabox avatar Jun 20 '19 18:06 oxinabox

As an aside, the @eval shouldn't be necessary; instead, the returned expression could be :toplevel.

ararslan avatar Jun 20 '19 19:06 ararslan

We could also use https://github.com/c42f/FastClosures.jl, which I believe does the hack I described above, but is a registered, maintained(?) package

Semi-maintained - I rather dislike that package and I wish it would go away ;-) Seeing this comment has caused me to release a new version with do support though: https://github.com/JuliaRegistries/General/pull/1934

Regardless of that, your macro may be better because the let trick doesn't always work.

c42f avatar Jul 10 '19 09:07 c42f

The macro i posted is insane though, and needs some complicated code written to mangle function bodies into shape. (I thinking Mocking.jl might actually have that code, randomly)

oxinabox avatar Jul 10 '19 09:07 oxinabox

Actually I didn't look at the details of your macro. All I meant to say is if there's another more predictable way to do this it might be preferable to the let trick.

The let trick itself needs some horrible re-implementation of parts of lowering to work (and it's still a bit wonky TBH, it's a hack).

c42f avatar Jul 10 '19 09:07 c42f