cudnn.torch

[R5] Running lua test_rnn.lua produces errors

Open TangXing opened this issue 8 years ago • 4 comments

Running 9 tests
1/9 testBiDirectionalLSTMRNN ............................................ [PASS]
2/9 testRNNTANH ......................................................... [ERROR]
3/9 testBiDirectionalTANHRNN ............................................ [PASS]
4/9 testRNNGRU .......................................................... [ERROR]
5/9 testRNNBatchFirst ................................................... [ERROR]
6/9 testRNNRELU ......................................................... [ERROR]
7/9 testBiDirectionalGRURNN ............................................. [PASS]
8/9 testBiDirectionalRELURNN ............................................ [PASS]
9/9 testRNNLSTM ......................................................... [ERROR]

Completed 22 asserts in 9 tests with 0 failures and 5 errors

testRNNTANH
Function call failed
unable to convert argument 4 from cdata<enum 6*> to cdata<enum 9*>
stack traceback:
    [C]: in function '?'
    /home//torch/install/share/lua/5.1/cudnn/init.lua:55: in function 'errcheck'
    test_rnn.lua:232: in function 'getRNNCheckSums'
    test_rnn.lua:61: in function <test_rnn.lua:53>
    [C]: in function 'xpcall'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:475: in function '_pcall'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:435: in function '_run'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:354: in function 'run'
    test_rnn.lua:316: in main chunk
    [C]: ?

testRNNGRU
Function call failed
unable to convert argument 4 from cdata<enum 6*> to cdata<enum 9*>
stack traceback:
    [C]: in function '?'
    /home/q/torch/install/share/lua/5.1/cudnn/init.lua:55: in function 'errcheck'
    test_rnn.lua:232: in function 'getRNNCheckSums'
    test_rnn.lua:97: in function <test_rnn.lua:90>
    [C]: in function 'xpcall'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:475: in function '_pcall'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:435: in function '_run'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:354: in function 'run'
    test_rnn.lua:316: in main chunk
    [C]: ?

testRNNBatchFirst
Function call failed
unable to convert argument 4 from cdata<enum 6*> to cdata<enum 9*>
stack traceback:
    [C]: in function '?'
    /home/torch/install/share/lua/5.1/cudnn/init.lua:55: in function 'errcheck'
    test_rnn.lua:232: in function 'getRNNCheckSums'
    test_rnn.lua:43: in function <test_rnn.lua:34>
    [C]: in function 'xpcall'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:475: in function '_pcall'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:435: in function '_run'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:354: in function 'run'
    test_rnn.lua:316: in main chunk
    [C]: ?

testRNNRELU
Function call failed
unable to convert argument 4 from cdata<enum 6*> to cdata<enum 9*>
stack traceback:
    [C]: in function '?'
    /home//torch/install/share/lua/5.1/cudnn/init.lua:55: in function 'errcheck'
    test_rnn.lua:232: in function 'getRNNCheckSums'
    test_rnn.lua:24: in function <test_rnn.lua:16>
    [C]: in function 'xpcall'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:475: in function '_pcall'
    /home/ tang/torch/install/share/lua/5.1/torch/Tester.lua:435: in function '_run'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:354: in function 'run'
    test_rnn.lua:316: in main chunk
    [C]: ?

testRNNLSTM
Function call failed
unable to convert argument 4 from cdata<enum 6*> to cdata<enum 9*>
stack traceback:
    [C]: in function '?'
    /home/orch/install/share/lua/5.1/cudnn/init.lua:55: in function 'errcheck'
    test_rnn.lua:232: in function 'getRNNCheckSums'
    test_rnn.lua:78: in function <test_rnn.lua:71>
    [C]: in function 'xpcall'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:475: in function '_pcall'
    /home//torch/install/share/lua/5.1/torch/Tester.lua:435: in function '_run'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:354: in function 'run'
    test_rnn.lua:316: in main chunk
    [C]: ?

lua: /home/torch/install/share/lua/5.1/torch/Tester.lua:362: An error was found while running tests!
stack traceback:
    [C]: in function 'assert'
    /home/torch/install/share/lua/5.1/torch/Tester.lua:362: in function 'run'
    test_rnn.lua:316: in main chunk
    [C]: ?

TangXing, Apr 27 '16 09:04

Can you try applying the following patch and see if it fixes your problem? +cc @SeanNaren

diff --git a/test/test_rnn.lua b/test/test_rnn.lua
index e7ee3de..d8e83e6 100644
--- a/test/test_rnn.lua
+++ b/test/test_rnn.lua
@@ -233,7 +233,7 @@ function getRNNCheckSums(miniBatch, seqLength, hiddenSize, numberOfLayers, numbe
                     linLayerMatDesc[0],
                     minDim,
                     ffi.cast("cudnnDataType_t*", dataType),
-                    ffi.cast("cudnnDataType_t*", format),
+                    ffi.cast("cudnnTensorFormat_t*", format),
                     nbDims:data(),
                     filterDimA:data())

@@ -263,7 +263,7 @@ function getRNNCheckSums(miniBatch, seqLength, hiddenSize, numberOfLayers, numbe
                     linLayerBiasDesc[0],
                     minDim,
                     ffi.cast("cudnnDataType_t*", dataType),
-                    ffi.cast("cudnnDataType_t*", format),
+                    ffi.cast("cudnnTensorFormat_t*", format),
                     nbDims:data(),
                     filterDimA:data())
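
The patch above changes only the pointer type named in the cast for the format argument. As a rough illustration of that pattern, here is a minimal LuaJIT FFI sketch that does not call cuDNN at all. The enum types `demoDataType_t` and `demoTensorFormat_t` and the function `fakeGetDescriptor` are made-up stand-ins (assumptions for this example, not cuDNN's definitions); the point is only that each out-parameter buffer is cast to a pointer of its own enum type.

```lua
local ffi = require 'ffi'

-- Hypothetical stand-ins for cudnnDataType_t and cudnnTensorFormat_t.
-- The names and values are illustrative only, not taken from cudnn.h.
ffi.cdef[[
typedef enum { DEMO_DATA_FLOAT = 0, DEMO_DATA_DOUBLE = 1 } demoDataType_t;
typedef enum { DEMO_TENSOR_NCHW = 0, DEMO_TENSOR_NHWC = 1 } demoTensorFormat_t;
]]

-- One-element arrays serve as out-parameter storage.
local dataType = ffi.new("demoDataType_t[1]")
local format = ffi.new("demoTensorFormat_t[1]")

-- Cast each buffer to a pointer of its *own* enum type,
-- mirroring the corrected cast in the patch.
local dataTypePtr = ffi.cast("demoDataType_t*", dataType)
local formatPtr = ffi.cast("demoTensorFormat_t*", format)

-- Hypothetical Lua stand-in for a C getter that fills both out-parameters.
local function fakeGetDescriptor(dataTypeOut, formatOut)
   dataTypeOut[0] = 0   -- DEMO_DATA_FLOAT
   formatOut[0] = 1     -- DEMO_TENSOR_NHWC
end

fakeGetDescriptor(dataTypePtr, formatPtr)

-- The values written through the pointers are visible in the buffers.
print(tonumber(dataType[0]), tonumber(format[0]))
```

Whether a mismatched enum pointer is rejected appears to depend on the FFI implementation in use, so keeping the cast type aligned with the C prototype is the safer habit in any case.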

ngimel, Apr 27 '16 16:04

I reinstalled cuDNN using the latest available version and reinstalled the torch bindings, and the tests worked fine for me. @ngimel, applying the patch above also works for me, but I'm curious whether the test in its current state should be broken.

SeanNaren, Apr 27 '16 18:04

@SeanNaren, please take a look at the patch I posted; the cast that is there now is wrong. On some OSes it still works fine, as your example and mine show, but on others it breaks, as it did for TangXing (I am curious to see whether this patch solves his problem). The test in its current state, on Soumith's R5 branch, should not be broken, since the bindings currently in Soumith's R5 expect the cudnn v5 RC. The test on borisfom's R5 branch would be broken with cudnn v5 RC, because it expects the final version of cudnn v5. But with borisfom's R5 branch you won't even be able to require 'cudnn', as it checks that the cudnn version is 5005 or greater.
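
For readers unfamiliar with the 5005 figure, the kind of version gate described above could look roughly like the plain-Lua sketch below. It assumes the usual cuDNN version encoding of major * 1000 + minor * 100 + patch level; `decodeCudnnVersion` and `checkMinimumVersion` are hypothetical helper names for illustration, not functions from the cudnn.torch bindings.

```lua
-- Hypothetical helpers; not the bindings' actual code.
-- Assumes the version code is major * 1000 + minor * 100 + patch.
local function decodeCudnnVersion(code)
   local major = math.floor(code / 1000)
   local minor = math.floor((code % 1000) / 100)
   local patch = code % 100
   return major, minor, patch
end

-- Returns true if the found version code satisfies the requirement,
-- otherwise false plus a human-readable message.
local function checkMinimumVersion(found, required)
   if found < required then
      local major, minor, patch = decodeCudnnVersion(required)
      return false, string.format(
         "cudnn %d.%d.%d or newer is required, found version code %d",
         major, minor, patch, found)
   end
   return true
end

local ok, message = checkMinimumVersion(5004, 5005)
print(ok, message)
print(checkMinimumVersion(5005, 5005))
```

Under that encoding, 5005 corresponds to version 5.0.5, so any library reporting a lower code would be refused before a test could run.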

ngimel, Apr 27 '16 18:04

Ah right, that makes more sense. I'll wait to see if this fixes his issue; if so, I can add it to the PR I currently have open, if that makes things easier!

SeanNaren, Apr 27 '16 19:04