Matthew Schlegel comments

Results: 45 comments by Matthew Schlegel

Recurrent network interface updates/design

First steps toward block weights in RNNs (#1855). Would really like some thoughts before I move forward on this for the other cells. This impl works with cuDNN (I've...

Block parameters for RNNs (for cudnn paths)

Errors are in utils tests. Maybe I should comment those out to just have recurrent tests?

Block parameters for RNNs (for cudnn paths)

Something else to think about: when using cuDNN, the weight matrix is not organized the way it is in the CPU implementation. I have yet to think of a good solution...

Block parameters for RNNs (for cudnn paths)

Sorry, dumping more info/thoughts here. There is a lot of metadata that has to be passed to cuDNN. I'm wondering if we should have a Flux-side metadata struct to deal with...

Block parameters for RNNs (for cudnn paths)

Updated to use efficient view adjoint based on multigate. I also changed some things to better represent Flux's current state for recurrent layers. I was behind on some things.
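To illustrate the idea of per-gate views into a single parameter block, here is a minimal Python sketch. It is not Flux code and not the `multigate` implementation; the names `make_block` and `gate_views` and the back-to-back gate layout are assumptions made only for this example. It shows that writing through a gate's view updates the shared block without copying.

```python
from array import array

def make_block(hidden, n_gates):
    """Allocate one flat buffer holding all gates back to back (hypothetical layout)."""
    return array("d", [0.0] * (hidden * n_gates))

def gate_views(block, hidden, n_gates):
    """Return one zero-copy view per gate into the shared buffer."""
    mem = memoryview(block)
    return [mem[i * hidden:(i + 1) * hidden] for i in range(n_gates)]

# Four gates of size 3, e.g. the gates of an LSTM-style cell.
block = make_block(hidden=3, n_gates=4)
views = gate_views(block, hidden=3, n_gates=4)

# Writing through the second gate's view changes the underlying block.
views[1][0] = 2.5
print(block[3])
```

Because every gate is a slice of one buffer, the whole block can be handed to a backend as a single array while the per-gate code still addresses its own segment.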

Block parameters for RNNs (for cudnn paths)

I just realized this new version requires the use of a bias unit. I'm not sure how to solve that without a flag of some kind.

Block parameters for RNNs (for cudnn paths)

I dealt with the bias issue by dispatching on constructors and the `rnn_weights` function. So now we support no bias with `initb=Flux.Zeros`. I also added a test case for this...

Block parameters for RNNs (for cudnn paths)

Good point! I do not have a benchmark comparing Flux RNNs/GRUs/LSTMs to cudnn. But [this white paper](https://arxiv.org/pdf/1806.01818.pdf) does a good comparison over a variety of kernels. What I've found consistently...

Recurrent network interface updates/design

Was perusing the new Optimisers.jl stateless approach, and was wondering if we might want something similar to that for recurrent cells. This would also be similar to how Flax works,...
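As a rough illustration of what a stateless recurrent interface could look like, here is a minimal Python sketch. It is not the Flux, Optimisers.jl, or Flax API; `rnn_cell`, `scan`, and the scalar parameter tuple are hypothetical names chosen for this example. The cell takes the state explicitly and returns the new state together with the output, so nothing is stored inside the layer.

```python
import math

def rnn_cell(params, state, x):
    """One step of a scalar Elman-style cell; returns (new_state, output), mutates nothing."""
    w_x, w_h, b = params
    new_state = math.tanh(w_x * x + w_h * state + b)
    return new_state, new_state

def scan(cell, params, state, xs):
    """Thread the state through a sequence, collecting the outputs."""
    outputs = []
    for x in xs:
        state, y = cell(params, state, x)
        outputs.append(y)
    return state, outputs

params = (0.5, 0.1, 0.0)  # (w_x, w_h, b), arbitrary illustrative values
final_state, ys = scan(rnn_cell, params, 0.0, [1.0, 2.0, 3.0])
```

Since the caller owns the state, running the same sequence twice from the same initial state gives the same result, and there is no hidden state to reset between sequences.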

Recurrent network interface updates/design

This was a pretty easy interface to implement in Fluxperimental. Pull request here: https://github.com/FluxML/Fluxperimental.jl/pull/5. Once we are happy with the basic interface we can start fleshing out the functionality to stateful...