Improved-Dynamic-Memory-Networks-DMN-plus
Lasagne issue while running on GPU
After following your instructions and installing the prerequisites for running DMN+, I get the following error:
```
(keras-dmn)user1@dpl04:~/keras/Improved-Dynamic-Memory-Networks-DMN-plus$ python main.py --network dmn_tied --mode train --babi_id 1
Using gpu device 2: GeForce GTX TITAN X (CNMeM is enabled with initial size: 98.0% of memory, CuDNN not available)
==> parsing input arguments
==> Loading test from /home/IAIS/user1/keras/Improved-Dynamic-Memory-Networks-DMN-plus/data/tasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_train.txt
==> Loading test from /home/IAIS/user1/keras/Improved-Dynamic-Memory-Networks-DMN-plus/data/tasks_1-20_v1-2/en-10k/qa1_single-supporting-fact_test.txt
==> not using minibatch training in this mode
==> not used params in DMN class: ['shuffle', 'network', 'babi_id', 'batch_size', 'epochs', 'prefix', 'load_state', 'log_every', 'babi_test_id', 'save_every']
==> building input module
==> creating parameters for memory module
==> building episodic memory module (fixed number of steps: 3)
==> building answer module
==> collecting all parameters
==> building loss layer and computing updates
Traceback (most recent call last):
  File "main.py", line 194, in <module>
```
I used your theanorc file, adjusting the CUDA root. Thanks!
I am also getting this error. I am using:

```
In [1]: import theano
Using cuDNN version 7103 on context None
Mapped name None to device cuda: GeForce GTX 1080 Ti (0000:01:00.0)

In [2]: theano.__version__
Out[2]: u'1.0.1'

In [3]: import lasagne

In [4]: lasagne.__version__
Out[4]: '0.2.dev1'
```
The error originates from this check in `lasagne.updates`:
```python
if any(not isinstance(p, theano.compile.SharedVariable) for p in params):
    raise ValueError("params must contain shared variables only. If it "
                     "contains arbitrary parameter expressions, then "
                     "lasagne.utils.collect_shared_vars() may help you.")
```
I am trying to learn more about this problem.
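For reference, this check fires whenever `params` contains a Theano expression (for example a transposed or otherwise tied weight) instead of a plain shared variable, and `lasagne.utils.collect_shared_vars()` walks those expressions and returns the underlying shared variables. Below is a minimal standalone sketch of the failure and the workaround; the names `W`, `b`, `W_tied` and the toy loss are illustrative only and are not taken from dmn_tied.py:

```python
# Minimal sketch of the failure and the workaround. W, b, W_tied and the
# toy loss are illustrative only; they are not from dmn_tied.py.
import numpy as np
import theano
import theano.tensor as T
import lasagne

W = theano.shared(np.ones((3, 3), dtype=theano.config.floatX), name='W')
b = theano.shared(np.zeros(3, dtype=theano.config.floatX), name='b')

# A tied parameter expressed through another variable is a Theano
# expression, not a SharedVariable, so the update rules reject it.
W_tied = W.T

x = T.matrix('x')
loss = T.sum(T.dot(x, W_tied) + b)

try:
    lasagne.updates.adam(loss, [W_tied, b])
except ValueError as e:
    print(e)  # "params must contain shared variables only. ..."

# collect_shared_vars() traverses the expressions and returns the shared
# variables they depend on (here W and b), which adam() accepts.
shared_params = lasagne.utils.collect_shared_vars([W_tied, b])
updates = lasagne.updates.adam(loss, shared_params,
                               learning_rate=0.0001, beta1=0.5)
print([p.name for p in shared_params])
```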
After some hacking, the code runs now:
```diff
➜ Improved-Dynamic-Memory-Networks-DMN-plus git:(master) ✗ cat tmp
diff --git a/dmn_tied.py b/dmn_tied.py
index a3241b4..848e436 100644
--- a/dmn_tied.py
+++ b/dmn_tied.py
@@ -220,10 +220,10 @@ class DMN_tied:
         self.loss_l2 = 0

         self.loss = self.loss_ce + self.loss_l2
-
+
         #updates = lasagne.updates.adadelta(self.loss, self.params)
-        updates = lasagne.updates.adam(self.loss, self.params)
-        updates = lasagne.updates.adam(self.loss, self.params, learning_rate=0.0001, beta1=0.5) #from DCGAN paper
+        # updates = lasagne.updates.adam(self.loss, self.params)
+        updates = lasagne.updates.adam(self.loss, lasagne.utils.collect_shared_vars(self.params), learning_rate=0.0001, beta1=0.5) #from DCGAN paper
         #updates = lasagne.updates.adadelta(self.loss, self.params, learning_rate=0.0005)
         #updates = lasagne.updates.momentum(self.loss, self.params, learning_rate=0.0003)
@@ -439,7 +439,7 @@ class DMN_tied:
         with open(file_name, 'w') as save_file:
             pickle.dump(
                 obj = {
-                    'params' : [x.get_value() for x in self.params],
+                    'params' : [x.get_value() for x in lasagne.utils.collect_shared_vars(self.params)],
                     'epoch' : epoch,
                     'gradient_value': (kwargs['gradient_value'] if 'gradient_value' in kwargs else 0)
                 },
@@ -629,7 +629,7 @@ class DMN_tied:
         input_mask = input_masks[batch_index]

         ret = theano_fn(inp, q, ans, input_mask)
-        param_norm = np.max([utils.get_norm(x.get_value()) for x in self.params])
+        param_norm = np.max([utils.get_norm(x.get_value()) for x in lasagne.utils.collect_shared_vars(self.params)])

         return {"prediction": np.array([ret[0]]),
                 "answers": np.array([ans]),
```
I am not sure whether the modification is right; I will wait and see the result!
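In the meantime, one quick sanity check is to confirm that `collect_shared_vars()` really returns only shared variables and to compare how many entries get flattened. A rough sketch, assuming an already constructed `DMN_tied` instance named `dmn` (the name is hypothetical, not from the repository):

```python
# Rough sanity check -- `dmn` stands in for a constructed DMN_tied
# instance; the name is hypothetical and not part of the repository.
import theano
import lasagne

shared_params = lasagne.utils.collect_shared_vars(dmn.params)

# Every collected entry must be a shared variable, otherwise adam()
# would still raise the same ValueError.
assert all(isinstance(p, theano.compile.SharedVariable)
           for p in shared_params)

# A mismatch in counts hints that some entries in dmn.params are
# expressions over the same underlying weights (tied/transposed matrices).
print(len(dmn.params), 'raw params ->', len(shared_params), 'shared variables')
print([getattr(p, 'name', None) for p in shared_params])
```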