juice
Implement dropout
Should be pretty straightforward; a warmup for #10.
- [x] expand the cudnn bindings in rcudnn
- [x] use the rcudnn bindings in coaster-nn
- [x] create an appropriate interface in coaster
- [x] use that interface to define a layer in juice
- [ ] implement tests
Paper: http://www.cs.toronto.edu/~rsalakhu/papers/srivastava14a.pdf
Why is its backprop commented out?
If I read the paper correctly, the backpropagation is just a unit factor, which can be skipped. I am on my phone, so I cannot review the code right now. IIRC, backprop skips all dropped elements, which enables a good speedup.
Actually, that is incorrect: backprop should only propagate through the thinned network (section 5.1 of the linked paper), so unless the weights are zero, backprop may not be skipped.
Having reviewed the paper: training the thinned network essentially amounts to setting the gradient of the dropped units to zero, which is easily done. The normalization should be realized by an additional mechanism or variation parameter, which can be introduced in a separate PR.
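To make the point about the thinned network concrete, here is a minimal CPU-side sketch in plain Rust. It is not the rcudnn, coaster-nn, or juice API: the function names `dropout_forward` and `dropout_backward` are hypothetical, the mask is assumed to be supplied by the caller, and no normalization is applied, matching the suggestion to handle that separately.

```rust
/// Forward pass (hypothetical helper): zero every element whose mask
/// entry is `false`. No rescaling is applied in this sketch.
fn dropout_forward(input: &[f32], mask: &[bool]) -> Vec<f32> {
    assert_eq!(input.len(), mask.len(), "input and mask must match in length");
    input
        .iter()
        .zip(mask)
        .map(|(&x, &keep)| if keep { x } else { 0.0 })
        .collect()
}

/// Backward pass (hypothetical helper): reuse the SAME mask as the forward
/// pass, so the gradient only flows through the units that were kept.
/// Dropped units receive a zero gradient.
fn dropout_backward(grad_output: &[f32], mask: &[bool]) -> Vec<f32> {
    assert_eq!(grad_output.len(), mask.len(), "gradient and mask must match in length");
    grad_output
        .iter()
        .zip(mask)
        .map(|(&g, &keep)| if keep { g } else { 0.0 })
        .collect()
}
```

With a mask of `[true, false, true, false]`, both passes leave positions 0 and 2 untouched and zero positions 1 and 3, which is why backprop cannot simply be skipped unless nothing was dropped.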