Matthew Schlegel

45 comments by Matthew Schlegel

First steps toward block weights in RNNs (#1855). I would really like some thoughts before I move forward on this for the other cells. This impl works with cuDNN (I've...

Errors are in utils tests. Maybe I should comment those out to just have recurrent tests?

Something else to think about: when using cuDNN, the weight matrix is not organized the way it is in the CPU implementation. I have yet to think of a good solution...

Sorry, dumping more info/thoughts here. There is a lot of metadata that has to be passed to cuDNN. I'm wondering if we should have a Flux-side metadata struct to deal with...

Updated to use efficient view adjoint based on multigate. I also changed some things to better represent Flux's current state for recurrent layers. I was behind on some things.
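For readers unfamiliar with the block-weight idea discussed in these comments, here is a rough sketch of what "one stacked weight matrix, split per gate" can look like. It is written in plain Python purely for illustration; `gate_rows` and `split_gates` are hypothetical names, not Flux's API, and the gate order and count (four, as in an LSTM) are assumptions.

```python
from typing import List

Matrix = List[List[float]]


def gate_rows(hidden: int, n: int) -> slice:
    """Row range of the n-th gate (0-based) inside a stacked weight matrix."""
    return slice(hidden * n, hidden * (n + 1))


def split_gates(block: Matrix, hidden: int) -> List[Matrix]:
    """Split a stacked (n_gates * hidden)-row matrix into per-gate pieces.

    Python list slicing copies the outer list; an array library would
    typically hand back views into the same storage instead.
    """
    if hidden <= 0 or len(block) % hidden != 0:
        raise ValueError("row count must be a positive multiple of `hidden`")
    n_gates = len(block) // hidden
    return [block[gate_rows(hidden, n)] for n in range(n_gates)]


# Assumed layout: 4 gates stacked row-wise, hidden size 2, input size 3.
hidden = 2
block = [[float(r)] * 3 for r in range(4 * hidden)]
gates = split_gates(block, hidden)
```

The point of the single block is that the whole layer does one matrix multiply and the gates are then just index ranges into the result, which is also roughly how a fused backend wants its parameters laid out.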

I just realized this new version requires the use of a bias unit. I'm not sure how to solve that without a flag of some kind.

I dealt with the bias issue by dispatching on constructors and the `rnn_weights` function. So now we support no bias with `initb=Flux.Zeros`. I also added a test case for this...

Good point! I do not have a benchmark comparing Flux RNNs/GRUs/LSTMs to cudnn. But [this white paper](https://arxiv.org/pdf/1806.01818.pdf) does a good comparison over a variety of kernels. What I've found consistently...

Was perusing the new Optimisers.jl stateless approach, and was wondering if we might want something similar to that for recurrent cells. This would also be similar to how Flax works,...
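To make the "stateless" idea concrete, here is a minimal sketch, assuming a plain tanh RNN cell: the cell is a pure function that takes the hidden state as an argument and returns the new one, rather than storing it inside the layer. This is Python for illustration only; `rnn_cell`, `scan`, and the parameter layout are hypothetical and are not the Flux, Optimisers.jl, or Flax interfaces.

```python
import math
from typing import Callable, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Params = Tuple[Matrix, Matrix, Vector]  # (Wx, Wh, b), an assumed layout


def matvec(m: Matrix, v: Vector) -> Vector:
    """Plain matrix-vector product."""
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def rnn_cell(params: Params, h: Vector, x: Vector) -> Tuple[Vector, Vector]:
    """One step of a tanh RNN. Returns (new_state, output); mutates nothing."""
    wx, wh, b = params
    pre = [a + c + d for a, c, d in zip(matvec(wx, x), matvec(wh, h), b)]
    h_new = [math.tanh(p) for p in pre]
    return h_new, h_new


def scan(cell: Callable, params: Params, h0: Vector,
         xs: List[Vector]) -> Tuple[Vector, List[Vector]]:
    """Thread the state through a sequence explicitly."""
    h, ys = h0, []
    for x in xs:
        h, y = cell(params, h, x)
        ys.append(y)
    return h, ys
```

Because the caller owns the state, resetting it is just passing a fresh `h0`, and nothing hidden in the layer needs to be tracked between calls.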

Was a pretty easy interface to implement in Fluxperimental. Pull request here: https://github.com/FluxML/Fluxperimental.jl/pull/5. Once we are happy with the basic interface we can start fleshing out the functionality to stateful...