CNN_sentence
NotImplementedError: The image and the kernel must have the same type.inputs
ubgpu@ubgpu:~/github/CNN_sentence$ sudo python conv_net_sentence.py -nonstatic -word2vec
Using gpu device 0: GeForce GTX 970
loading data... data loaded!
model architecture: CNN-non-static
using: word2vec vectors
[('image shape', 64, 300), ('filter shape', [(100, 1, 3, 300), (100, 1, 4, 300), (100, 1, 5, 300)]), ('hidden_units', [100, 2]), ('dropout', [0.5]), ('batch_size', 50), ('non_static', True), ('learn_decay', 0.95), ('conv_non_linear', 'relu'), ('non_static', True), ('sqr_norm_lim', 9), ('shuffle_batch', True)]
Traceback (most recent call last):
File "conv_net_sentence.py", line 317, in
Hi @andyyuan78, I met the same problem. Have you fixed it?
No response yet; I haven't tried to fix it.
Try forcing the type of the word matrices created in process_data.py to be float32 ...
from

    W = np.zeros(shape=(vocab_size+1, k))
    W[0] = np.zeros(k)

to

    W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
    W[0] = np.zeros(k, dtype='float32')
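For background on why this should help: numpy allocates float64 by default, while under THEANO_FLAGS=floatX=float32 the convolution filters are created as float32, so the image and kernel dtypes disagree. A minimal sketch (toy sizes, using theano.config.floatX to mirror the flag):

    import numpy as np
    import theano

    vocab_size, k = 5, 300  # toy sizes for illustration

    W = np.zeros(shape=(vocab_size + 1, k))
    print(W.dtype)    # float64 -- numpy's default

    W32 = np.zeros(shape=(vocab_size + 1, k), dtype=theano.config.floatX)
    print(W32.dtype)  # float32 when THEANO_FLAGS includes floatX=float32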
It doesn't work yet:
ubgpu@ubgpu:~/github/CNN_sentence$ sudo python conv_net_sentence.py -nonstatic -word2vec
Using gpu device 0: GeForce GTX 970
loading data... data loaded!
model architecture: CNN-non-static
using: word2vec vectors
[('image shape', 64, 300), ('filter shape', [(100, 1, 3, 300), (100, 1, 4, 300), (100, 1, 5, 300)]), ('hidden_units', [100, 2]), ('dropout', [0.5]), ('batch_size', 50), ('non_static', True), ('learn_decay', 0.95), ('conv_non_linear', 'relu'), ('non_static', True), ('sqr_norm_lim', 9), ('shuffle_batch', True)]
Traceback (most recent call last):
File "conv_net_sentence.py", line 317, in
The change I made in process_data.py:

    - W = np.zeros(shape=(vocab_size+1, k))
    - W[0] = np.zeros(k)
    + W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
    + W[0] = np.zeros(k, dtype='float32')
      i = 1
      for word in word_vecs:
          W[i] = word_vecs[word]

ubgpu@ubgpu:~/github/CNN_sentence$
Even after changing it to float64, it still doesn't work:
ubgpu@ubgpu:~/github/CNN_sentence$ sudo python conv_net_sentence.py -nonstatic -word2vec
Using gpu device 0: GeForce GTX 970
loading data... data loaded!
model architecture: CNN-non-static
using: word2vec vectors
[('image shape', 64, 300), ('filter shape', [(100, 1, 3, 300), (100, 1, 4, 300), (100, 1, 5, 300)]), ('hidden_units', [100, 2]), ('dropout', [0.5]), ('batch_size', 50), ('non_static', True), ('learn_decay', 0.95), ('conv_non_linear', 'relu'), ('non_static', True), ('sqr_norm_lim', 9), ('shuffle_batch', True)]
Traceback (most recent call last):
File "conv_net_sentence.py", line 317, in
The change I made in process_data.py:

    - W = np.zeros(shape=(vocab_size+1, k))
    - W[0] = np.zeros(k)
    + W = np.zeros(shape=(vocab_size+1, k), dtype='float64')
    + W[0] = np.zeros(k, dtype='float64')
      i = 1
      for word in word_vecs:
          W[i] = word_vecs[word]

ubgpu@ubgpu:~/github/CNN_sentence$
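That float64 also fails is expected: under floatX=float32 the filters are still built as float32, so the dtypes still disagree. The message itself comes from Theano's legacy conv op, which rejects mismatched dtypes at graph-construction time. A minimal reproduction (a sketch, assuming the old theano.tensor.nnet.conv module this code uses):

    import theano.tensor as T
    from theano.tensor.nnet import conv

    img = T.tensor4(dtype='float64')   # e.g. sentences built from a float64 Words matrix
    kern = T.tensor4(dtype='float32')  # filters created as float32 under floatX=float32
    try:
        conv.conv2d(img, kern)
    except NotImplementedError as e:
        print(e)  # "The image and the kernel must have the same type. ..."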
When including floatX=float32 in THEANO_FLAGS, I met the same issue; without specifying floatX, however, the code works on my MacBook Pro. Note that Yoon's code doesn't print out running durations. CNN on GPU using cuDNN 3 seems no faster than running on the CPU; the duration difference needs to be benchmarked later.
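For the missing durations, a timing wrapper around the epoch loop would be enough. A minimal sketch (n_epochs and the loop body stand in for the real training loop in conv_net_sentence.py):

    import time

    n_epochs = 5  # matches the runs reported later in this thread

    start = time.time()
    for epoch in range(n_epochs):
        pass  # the existing per-epoch training/validation code goes here
    print("Looping %d times took %f seconds" % (n_epochs, time.time() - start))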
I'm having this same problem on many system configurations. I've yet to get the code working on the GPU at all. Can someone who has a working setup post their configuration?
Changing floatX to float32 doesn't do anything. @leocnj, I believe the reason you did not see a speed improvement is that when you switch to float64 the error goes away, but the GPU still fails to be utilized. You can check GPU utilization with the nvidia-smi command.
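Besides nvidia-smi, another way to check is the test script from the Theano documentation, which inspects the compiled graph to see whether GPU ops were actually used (reproduced here as a sketch):

    from theano import function, config, shared
    import theano.tensor as T
    import numpy
    import time

    vlen = 10 * 30 * 768  # 10 x #cores x #threads per core
    iters = 1000

    rng = numpy.random.RandomState(22)
    x = shared(numpy.asarray(rng.rand(vlen), config.floatX))
    f = function([], T.exp(x))
    t0 = time.time()
    for i in range(iters):
        r = f()
    print("Looping %d times took %f seconds" % (iters, time.time() - t0))
    # If any op in the compiled graph is still a plain CPU Elemwise,
    # the GPU was not used.
    if numpy.any([isinstance(n.op, T.Elemwise) for n in f.maker.fgraph.toposort()]):
        print("Used the cpu")
    else:
        print("Used the gpu")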
I'm trying to run this on an Amazon g2 instance (GRID GPU).
To run this code on the GPU (float32) you need to modify the following (a sketch of what allow_input_downcast=True does follows this list):

process_data.py

    line 55: W = np.zeros(shape=(vocab_size+1, k), dtype='float32')
    line 56: W[0] = np.zeros(k, dtype='float32')

conv_net_sentence.py

    line 82: set_zero = theano.function([zero_vec_tensor], updates=[(Words, T.set_subtensor(Words[0,:], zero_vec_tensor))], allow_input_downcast=True)

    line 131: val_model = theano.function([index], classifier.errors(y),
                  givens={
                      x: val_set_x[index * batch_size: (index + 1) * batch_size],
                      y: val_set_y[index * batch_size: (index + 1) * batch_size]},
                  allow_input_downcast=True)

    line 137: test_model = theano.function([index], classifier.errors(y),
                  givens={
                      x: train_set_x[index * batch_size: (index + 1) * batch_size],
                      y: train_set_y[index * batch_size: (index + 1) * batch_size]},
                  allow_input_downcast=True)

    line 141: train_model = theano.function([index], cost, updates=grad_updates,
                  givens={
                      x: train_set_x[index * batch_size: (index + 1) * batch_size],
                      y: train_set_y[index * batch_size: (index + 1) * batch_size]},
                  allow_input_downcast=True)

    line 155: test_model_all = theano.function([x, y], test_error, allow_input_downcast=True)
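As promised above, here is what allow_input_downcast=True buys you; a minimal sketch, independent of the repo's code:

    import numpy as np
    import theano
    import theano.tensor as T

    x = T.fvector('x')  # a float32 vector, as the graph expects under floatX=float32
    f = theano.function([x], x * 2, allow_input_downcast=True)

    # Without allow_input_downcast=True, Theano raises a TypeError when handed
    # float64 data; with it, the input is silently downcast to float32.
    print(f(np.arange(3, dtype='float64')))  # -> [ 0.  2.  4.]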
My results on GPU (THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python conv_net_sentence.py -static -rand):

    epoch 1, train perf 60.150289 %, val perf 58.105263
    epoch 2, train perf 72.936416 %, val perf 67.368421
    epoch 3, train perf 75.213873 %, val perf 63.473684
    epoch 4, train perf 87.803468 %, val perf 70.947368
    epoch 5, train perf 93.248555 %, val perf 70.421053
    Looping 5 times took 133.631452 seconds

For CPU (THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python conv_net_sentence.py -static -rand):

    epoch 1, train perf 60.774566 %, val perf 58.842105
    epoch 2, train perf 72.994220 %, val perf 67.263158
    epoch 3, train perf 74.809249 %, val perf 62.947368
    epoch 4, train perf 88.080925 %, val perf 69.473684
    epoch 5, train perf 92.751445 %, val perf 69.894737
    Looping 5 times took 690.696883 seconds
    cv: 0, perf: 0.716417910448
I modified the files the way @manuelvargas760 said, and it works! Note: after modifying process_data.py, you should re-run process_data.py to regenerate the model parameters and word vectors.
Feel free to push if you've modified the code to get the GPU working, and I'll make sure to merge :)