Kyle Daruwalla comments

Results: 404 comments of Kyle Daruwalla

`using Flux, cuDNN` freezes, but `using Flux, CUDA, cuDNN` works

cuDNN has its own extension that triggers on CUDA + cuDNN. I don't see what would hang in the extension itself, so maybe the partial overlap of CUDA is causing...

Tied weights using Flux layers

We should probably change that line in the Adam code to use Adapt.jl to get the correct type instead of hard-typing the return of `get!`.

Tied weights using Flux layers

I don't think we need the fix in Optimisers.jl because the state is initialized separately (and correctly). This appears to only be a bug for IdDict optimizers. Agreed that we...

Tied weights using Flux layers

> taking 2 steps of adam with separate gradients instead of a single step with the accumulated one

Yeah, with ADAM this will certainly be wrong. Referencing https://github.com/FluxML/Zygote.jl/issues/991#issuecomment-864411649, it's not...

doc reference for `gpu_backend!`

Related to #2293 as well

[docs] Highlight `update!` API more to attract DL researchers

FWIW, most of the current maintainers do not like `train!` or the current difficulty finding information in the docs. We discussed both topics in the most recent ML call, and...

Make `loss(f,x,y) == loss(f(x), y)`

A NEWS entry for this feature would be good too

Speed up `show` for gpu models

I think it might be confusing for users. Someone used to `show` on the CPU might interpret the absence of NaN messages as no NaNs. Is it possible to make...

Request: Superscript and Subscript

That subset of letters would also be awesome for programming languages with Unicode support. They are the most often used letters for indexing so having super and subscripts would be...

Pullback for `tr` produces a CPU `Diagonal` causing downstream scalar indexing on GPUs

Yeah, that seems to avoid scalar indexing