Michael Abbott
Worth trying with ideas from #1126, the simplest of which is to run this before your model:
```
@eval Flux (c::Chain)(x) = foldl((y,f) -> f(y), (x, c.layers...))
```
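To see what that override computes without loading Flux, here is a minimal sketch. `MyChain` is a hypothetical stand-in for `Flux.Chain` (assumed only to hold a tuple of layers), not Flux's actual definition:

```julia
# Hypothetical stand-in for Flux.Chain: it just stores a tuple of callables.
struct MyChain{T<:Tuple}
    layers::T
end

# Same shape as the suggested override: thread x through the layers with foldl.
# The accumulator starts as x, and each step applies the next layer to it.
(c::MyChain)(x) = foldl((y, f) -> f(y), (x, c.layers...))

c = MyChain((x -> x + 1, x -> 2x, sqrt))
result = c(7)  # equivalent to sqrt(2 * (7 + 1))
```

The fold applies the layers left to right, so it should agree with writing the nested calls out by hand.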
Another random idea, if you're trying things, is https://github.com/JuliaLang/julia/pull/43370 (with Julia 1.8)
> What has helped is the use of -O1 optimization flag. We could consider setting this for the package, like so: https://github.com/JuliaPlots/Plots.jl/pull/2544/files Assuming it had the same good effect, and...
Does anything in IRTools run at runtime? That might also be a good candidate for being `-O1`. And `@max_methods 1` à la JuliaLang/julia#43370.
`Base.Experimental.@optlevel 0` seems to help and not hurt this Flux example: https://github.com/FluxML/Zygote.jl/issues/1126#issuecomment-1009198264 . I saw little effect from adding it only to IRTools.
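For anyone who hasn't used it, the macro is placed at the top level of a module and applies to code defined in that module. A minimal sketch, where the module name `SlowToCompile` and the function `relu_sum` are made up for illustration:

```julia
module SlowToCompile  # hypothetical module name, for illustration only

# Ask the compiler to use optimisation level 0 for code defined in this module.
Base.Experimental.@optlevel 0

# Ordinary code; results are unchanged, only the compile/runtime trade-off moves.
relu_sum(xs) = sum(x -> max(x, zero(x)), xs)

end # module

total = SlowToCompile.relu_sum([-1.0, 2.0, 3.0])
```

Whether this helps latency in any given package is something to measure; the sketch only shows where the annotation goes.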
#1126 also points to Julia 1.6 (released on 24 March 2021) being much slower than 1.5 in this regard.
Benchmarks:
* Some here: https://github.com/FluxML/Zygote.jl/issues/994#issuecomment-861065316
* From #962
```
julia> f4(x) = sum([x[i]^2 for i in eachindex(x)]);

julia> @btime Zygote.gradient(f4, $(collect(1:1000)))
  749.750 μs (7068 allocations: 15.77 MiB)  # v0.6.11
  455.959...
```
I haven't thought about this since, except to realise that https://github.com/bkamins/ReadOnlyArrays.jl might be better than the version I wrote here. The checks I wrote for `function (s::ZBack)(dy)` try to handle...
A narrower idea is to make in-place accumulation work only for the result of scalar indexing:
```julia
function accum(x::OneElement{T,N}, ys::OneElement{T,N}...) where {T,N}
    z = Buffer(x)
    fill!(z.data, zero(T))
    z[x.ind...] = x.val...
```
This used to go here: https://github.com/FluxML/Zygote.jl/blob/v0.6.12/src/lib/array.jl#L299 After #990 and #1004 it goes here, which calls the adjoint for broadcasting: https://github.com/FluxML/Zygote.jl/blob/master/src/lib/broadcast.jl#L278-L283 And that won't work, because broadcasting doesn't handle complex CuArrays...