Tullio.jl
Can't use a finalizer with scalar output
Not really a bug per se, but is there a reason why not? I was pretty surprised by this.
ERROR: LoadError: LoadError: LoadError: "can't use a finaliser |> with scalar output"
Stacktrace:
[1] parse_input(expr::Any, store::Any)
@ Tullio ~/.julia/packages/Tullio/IHd6P/src/macro.jl:307
[2] _tullio(exs::Any; mod::Any)
@ Tullio ~/.julia/packages/Tullio/IHd6P/src/macro.jl:90
[3] var"@tullio"(__source__::LineNumberNode, __module__::Module, exs::Vararg{Any, N} where N)
@ Tullio ~/.julia/packages/Tullio/IHd6P/src/macro.jl:35
[4] include(x::String)
@ Bombe ~/.julia/dev/Bombe/src/Bombe.jl:1
[5] top-level scope
@ ~/.julia/dev/Bombe/src/Bombe.jl:9
[6] top-level scope (repeats 2 times)
@ none:1
I ask because it's mildly convenient to take an average by summing and then finalizing with |> _ / n.
The logic is roughly that @tullio z := x[i] |> f can equally well be written z = (@tullio _ := x[i]) |> f, since it produces just one number, whereas for an array, @tullio y[i] := exp <| log(A[i,j]) is able to avoid allocating an intermediate array. So it didn't seem worth the extra complexity of handling this in the scalar case.
That said, it's not easy to find examples where the efficiency gain is large; the array you save is often O(N) next to O(N^2) things. So the whole feature may be more a convenience notation than a performance one.
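To make the comparison concrete, here is a minimal sketch of both cases. The names x, A, s, t, and avg are made up for illustration, and the scalar reduction is written as @tullio s := x[i], with the finalizer applied outside the macro as an ordinary function call:

```julia
using Tullio

# Scalar case: reduce with @tullio, then apply the "finalizer" by hand.
x = rand(100)
n = length(x)
@tullio s := x[i]          # s is a plain number, the sum over i
avg = s / n                # what `|> _ / n` would have computed

# Array case: the finalizer runs once per i, after the sum over j.
A = rand(10, 20) .+ 1      # shifted away from zero so log is well behaved
@tullio y[i] := log(A[i,j]) |> exp

# The same result with an explicit intermediate vector t,
# which is the allocation the finalizer avoids.
@tullio t[i] := log(A[i,j])
y2 = exp.(t)
```

Here t has length 10 while A has 200 entries, which is the O(N) next to O(N^2) situation described above.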
Any reason not to just simplify down to z = (@tullio _ := x[i]) |> f inside the macro when a finalizer is called on a scalar, then? I might be dramatically underestimating the complexity, but this feels like the kind of thing that should be doable in 2 or 3 lines.
I'll look when next I'm down in there. There's some terrifying code I meant to straighten out for deciding if it can differentiate y[i] := f <| g(A[i,j]) without having stored g.(A), which is tangled up with how it stores |> _/n between parsing it and applying it. And the scalar case has a fairly separate output path, including a different multi-threading scheme.