Natalia Gimelshein

89 comments by Natalia Gimelshein

Thanks for reporting the issue! We have discovered that there are problems with dropout application in cudnn (the non-determinism you've discovered, and issues in the weight update), and are looking...

Yes, we are applying dropout to the input of each layer.

Try limiting your workspace size by setting cudnn.maxWorkspaceGPUMemPercent (say, to 30 or 40).
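For example, something like the following (a minimal sketch; it assumes the setting is a global that cudnn reads when choosing algorithms and allocating workspace, as the comment above suggests):

```lua
require 'cudnn'

-- Cap the cudnn workspace at roughly 30% of free GPU memory
-- (illustrative value; try 30 or 40 as suggested above).
cudnn.maxWorkspaceGPUMemPercent = 30
```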

You should be moving to PyTorch; Torch is no longer supported.

cudnn RNN/LSTM accepts inputs with different sequence lengths and thus does not require padding. The requirement is that the inputs be sorted in descending order of sequence length. This...

From the manual :-) In the cudnnRNNForwardTraining entry: "The first dimension of the tensors may decrease from element n to element n+1 but may not increase."
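To make the rule concrete, here is a small standalone sketch (plain Lua, not part of the bindings): given sequence lengths sorted in descending order, it computes the batch size seen at each time step, which can only shrink as the step index grows.

```lua
-- Standalone illustration, no cudnn required.
-- With lengths sorted in descending order, the batch size at time step t
-- is the number of sequences still "alive" at that step, so the first
-- (batch) dimension can decrease over time but never increase.
local lengths = {5, 5, 3, 2}   -- must be sorted in descending order
local maxLen = lengths[1]

local batchSizes = {}
for t = 1, maxLen do
   local n = 0
   for _, len in ipairs(lengths) do
      if len >= t then n = n + 1 end
   end
   batchSizes[t] = n
end

print(table.concat(batchSizes, ' '))  -- 4 4 3 2 2
```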

Sequences of different lengths can already be grouped into a batch without padding; cudnn supports that. The Torch bindings don't, at the moment.

Look at the variable-length sequences test for an example of how it can be done: https://github.com/soumith/cudnn.torch/blob/master/test/test_rnn.lua#L324

cudnn.Dropout:setp() will be broken (for it to work, cudnn.Dropout:setp() should call setDropoutDescriptor with NULL as the states argument if the states are already initialized). (And thanks for these bindings!)
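A rough sketch of the fix described above. This assumes the bindings wrap the cuDNN C call cudnnSetDropoutDescriptor via FFI in the style of cudnn.torch; the helper names (cudnn.errcheck, cudnn.getHandle, the dropoutDesc field) are illustrative and may not match these bindings exactly.

```lua
-- Sketch only: update the dropout probability without re-initializing the
-- RNG states when they already exist.
function Dropout:setp(p)
   self.p = p
   if self.dropoutDesc then
      -- States already initialized: pass NULL (nil) as the states argument
      -- so cuDNN only updates the dropout probability instead of
      -- re-seeding/re-initializing the states.
      cudnn.errcheck('cudnnSetDropoutDescriptor', self.dropoutDesc[0],
                     cudnn.getHandle(), self.p, nil, 0, 0)
   end
   return self
end
```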

There is a potential problem with several dropout layers sharing the states while running concurrently in different streams, but it's hard to make this scenario completely safe.