Mike J Innes
Maybe I misunderstand, but I do think the best way right now is to just overload `_pullback` directly, at least in the short term. Happy to help with how to...
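For illustration, a minimal sketch of what overloading `_pullback` can look like (`myfunc` is an invented example, not anything from this thread):

```julia
using Zygote
using Zygote: AContext

# Made-up function we want a hand-written gradient for.
myfunc(x) = x^2 + 2x

# Overload the internal `_pullback` directly: return the primal value
# plus a closure mapping the output cotangent Δ to cotangents for
# (the function itself, each argument).
function Zygote._pullback(cx::AContext, ::typeof(myfunc), x)
    y = myfunc(x)
    back(Δ) = (nothing, Δ * (2x + 2))  # `nothing` for the function, then dy/dx
    return y, back
end

Zygote.gradient(myfunc, 3.0)  # (8.0,)
```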
I'm on board; I think it'd make sense to add them to NNlib. If there are any volunteers I'd be happy to look at a PR.
That looks great. Any chance you can put it up as a PR in Literate.jl format? Then we can look at debugging performance.
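In case it helps: a Literate.jl source is just a plain `.jl` file where `#`-prefixed lines become markdown prose and everything else stays runnable code, roughly like this sketch:

```julia
# # Example title
#
# Prose written in `#` comments is rendered as markdown when
# Literate.jl processes the file; unprefixed lines are executed code.

using Flux

model = Dense(10, 2)
model(rand(Float32, 10))
```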
Using the manifest you can depend on a branch of Metalhead, if you want to try that.
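Something like this, where `some-branch` is just a placeholder branch name:

```
pkg> add Metalhead#some-branch
```

Pkg will record that branch's commit in the `Manifest.toml`, so the project tracks it until you free or update the package.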
I like the idea!
Is this still WIP? Would be nice to have a README and some walkthrough in the code. Also, it's probably best not to define a module for a script like this.
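i.e. something along these lines:

```julia
# Rather than wrapping everything in a module:
#
#     module MyScript  # hypothetical name
#     ...
#     end
#
# a script can just be top-level code, runnable with `julia script.jl`
# or `include`d from the REPL:

using Flux

# ... script body ...
```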
Can you give a simple usage example for this, and/or a general idea of how it should be used?
Would be nice to figure out the training issues so that this can get more interesting results before we merge. Also, you shouldn't need to commit the `ipynb` file here...
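One way to keep generated notebooks out of the repo (assuming the notebook is produced from the script) is a couple of `.gitignore` lines:

```
# .gitignore
*.ipynb
.ipynb_checkpoints/
```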
This is probably coming from the `.+ b` [here](https://github.com/FluxML/Flux.jl/blob/9d56807bcd32461547dc8c4c0b3e8ef90057c2b8/src/layers/conv.jl#L44). During the forward pass `b` gets broadcast out, which means the gradient needs to be collapsed back down again (by summing over the broadcast dimensions).
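To illustrate the collapsing step (a standalone sketch using Flux's `gradient`, not the conv code itself):

```julia
using Flux

x = rand(Float32, 4, 3)  # 4 features, batch of 3
b = rand(Float32, 4)     # bias, broadcast across the batch dim

# Forward: `x .+ b` expands `b` along dim 2. Reverse: the output
# cotangent (all ones here) is summed over dim 2 to match `b`'s shape.
gs = gradient(b -> sum(x .+ b), b)
gs[1] == fill(3.0f0, 4)  # true: each bias entry accumulates 3 ones
```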
Yeah – the CUDNN wrappers were set up [here](https://github.com/JuliaGPU/CuArrays.jl/pull/145/files), so it just needs someone to set up the right dispatch on the Flux side.
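The pattern would be roughly the following, where the names are placeholders rather than the actual wrapper API from that PR:

```julia
using NNlib, CuArrays

# Route the generic NNlib op to the CUDNN wrapper for GPU arrays.
# `CUDNN.softmax!` stands in for whichever wrapper the PR added.
NNlib.softmax(x::CuArray) = CuArrays.CUDNN.softmax!(similar(x), x)
```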