Natalia Gimelshein

89 comments by Natalia Gimelshein

Thanks for reporting the issue! We have discovered that there are problems with dropout application in cudnn (the non-determinism you've discovered, and issues in the weight update), and are looking...

Yes, we are applying dropout to the input of each layer.

Try limiting your workspace size by setting cudnn.maxWorkspaceGPUMemPercent (say, to 30 or 40).
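For example, something like the following (a minimal sketch; it assumes the setting is a global that cudnn reads when choosing algorithms and allocating workspace, as the comment above suggests):

```lua
require 'cudnn'

-- Cap the cudnn workspace at roughly 30% of free GPU memory
-- (illustrative value; try 30 or 40 as suggested above).
cudnn.maxWorkspaceGPUMemPercent = 30
```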

You should be moving to PyTorch; Torch is no longer supported.

cudnn RNN/LSTM accepts inputs with different sequence lengths and thus does not require padding. The requirement is that the inputs be sorted in descending order of sequence length. This...

From the manual :-) In the cudnnRNNForwardTraining entry: "The first dimension of the tensors may decrease from element n to element n+1 but may not increase."
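To make the rule concrete, here is a small standalone sketch (plain Lua, not part of the bindings): given sequence lengths sorted in descending order, it computes the batch size seen at each time step, which can only shrink as the step index grows.

```lua
-- Standalone illustration, no cudnn required.
-- With lengths sorted in descending order, the batch size at time step t
-- is the number of sequences still "alive" at that step, so the first
-- (batch) dimension can decrease over time but never increase.
local lengths = {5, 5, 3, 2}   -- must be sorted in descending order
local maxLen = lengths[1]

local batchSizes = {}
for t = 1, maxLen do
   local n = 0
   for _, len in ipairs(lengths) do
      if len >= t then n = n + 1 end
   end
   batchSizes[t] = n
end

print(table.concat(batchSizes, ' '))  -- 4 4 3 2 2
```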

Sequences of different lengths can already be grouped into a batch without padding; cudnn supports that. The Torch bindings don't, at the moment.

Look at the variable-length sequences test for an example of how it can be done: https://github.com/soumith/cudnn.torch/blob/master/test/test_rnn.lua#L324

cudnn.Dropout:setp() will be broken (for it to work, cudnn.Dropout:setp() should call setDropoutDescriptor with NULL as the states argument if the states are already initialized). (And thanks for these bindings!)
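A rough sketch of the fix described above. This assumes the bindings wrap the cuDNN C call cudnnSetDropoutDescriptor via FFI in the style of cudnn.torch; the helper names (cudnn.errcheck, cudnn.getHandle, the dropoutDesc field) are illustrative and may not match these bindings exactly.

```lua
-- Sketch only: update the dropout probability without re-initializing the
-- RNG states when they already exist.
function Dropout:setp(p)
   self.p = p
   if self.dropoutDesc then
      -- States already initialized: pass NULL (nil) as the states argument
      -- so cuDNN only updates the dropout probability instead of
      -- re-seeding/re-initializing the states.
      cudnn.errcheck('cudnnSetDropoutDescriptor', self.dropoutDesc[0],
                     cudnn.getHandle(), self.p, nil, 0, 0)
   end
   return self
end
```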

There is a potential problem with several dropout layers sharing the states while running concurrently in different streams, but it's hard to make this scenario completely safe.