Kyurae Kim
It seems this error persists on Zygote `0.6.63`. For those who need a quick workaround, you can do the following:

```julia
gradient(y -> sum(x -> x^2, y), A)
```
@torfjelde This issue is still persisting; any suggestions on how we should deal with this? Maybe just change the `Stacked` bijector implementation so that we don't hit this edge-case at...
> Does such a fast-path exist?

Oh sorry, I meant `reduce(vcat)`. For this, I'm quoting @mcabbott's reply:

> note also that `reduce(vcat, xs; init)` and `mapreduce(f, vcat, xs)` are...
Finally fixed in [Bijectors.jl#315](https://github.com/TuringLang/Bijectors.jl/pull/315)!
@torfjelde Is this issue still relevant?
More information would be helpful: Are you using `Zygote` as the backend?
Can you try `Zygote` and see if the discrepancy is smaller? `Flux` tends to be optimized for `Zygote`, so that would be a useful comparison.
This appears to be more complicated. It seems that `gradient(y -> sum(x -> x^2, y)/10, CUDA.randn(10))` does not hit the `sum(f, x)` [rrule](https://github.com/JuliaDiff/ChainRules.jl/blob/3c93eb6f462efb7f8f6c1c5f212fcc3fedc4e1db/src/rulesets/Base/mapreduce.jl#L76C1-L129C4), while `mean(f, x)` does. This is super...
Hi all, would it be possible to get this moving? I think it would be really great to have this feature!
Hi @devmotion, could we make this happen?