Anton Smirnov

Results: 213 comments by Anton Smirnov

I guess it fails because `Test.@inferred` errors on the first non-inferrable function and `ChainRulesTestUtils` [doesn't do](https://github.com/JuliaDiff/ChainRulesTestUtils.jl/blob/dd0e246d1516451ec963db55a7a51910819082b3/src/testers.jl#L255) any special handling in this case. `test_rrule()` probably needs to wrap `@inferred` in a...
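A minimal sketch of the `Test.@inferred` behaviour mentioned in the comment above. The functions `stable` and `unstable` are hypothetical and are not taken from ChainRulesTestUtils; the `try`/`catch` is only one way to observe the error and is not necessarily the handling the comment goes on to propose.

```julia
using Test

# Hypothetical functions for illustration only.
stable(x::Float64) = x > 0 ? x : zero(x)   # always returns a Float64
unstable(x::Float64) = x > 0 ? x : 0       # returns a Float64 or an Int

# Passes: the inferred return type matches the type of the actual result.
stable_result = @inferred stable(1.0)

# Throws: the inferred return type is a Union, while the actual result
# is a Float64, so `@inferred` raises an error instead of returning.
inference_failed = try
    @inferred unstable(1.0)
    false
catch err
    err isa ErrorException
end
```

Because `@inferred` throws instead of recording a test failure, any code that runs after it in the same block is skipped unless the error is caught.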

Inputs are already quite small (128x128), and the batch size is set to 1 for these timings as well. Timings (in the same session, in that order): ```julia @time train_loss(model, x,...

Tried running on 0.6.29, but hit this error: ``` ERROR: LoadError: Need an adjoint for constructor Base.Iterators.Enumerate{Vector{Int64}}. ``` I guess the code relies on #785, as I had to use `map`,...

> Worth trying with ideas from #1126, the simplest of which is to run this before your model: > > ``` > @eval Flux (c::Chain)(x) = foldl((y,f) -> f(y), (x,...

> What are the times for the resnet model on CPU? I think comparing compilation latency between CPU and GPU forward passes would be the easiest way to start...

Here's also `ProfileView` for the whole ResNet `tinf` flamegraph (just in case). Everything in red is mostly Zygote related. ![resnet-gpu-def](https://user-images.githubusercontent.com/17990405/148306632-b1097936-ecf4-44bb-b0d3-cea35ed56852.png)

> Also, those forward times are pretty eye-watering! These are the timings for the very first run on a cold start. Subsequent ones are fast. Here's the forward pass for the GPU:...

@torfjelde in my case (an NN using Flux.jl), compile times can be significantly improved by fixing type-stability issues in the layers. See https://github.com/FluxML/NNlib.jl/pull/370 for timings, mainly of the first forward pass. But it doesn't...

https://github.com/JuliaGPU/KernelAbstractions.jl/issues/298 seems to be a related issue.

@vchuravy here's the output: [error.txt](https://github.com/EnzymeAD/Enzyme.jl/files/8928691/error.txt). This is with `device = CUDADevice()`.