CNN_sentence
Save trained model
Hi, can you add a method for saving the trained model for future prediction?
Same question. Did you find a way?
Once you are happy with your trained network, you can use Python's pickle module to serialize the object to a file with the dump method, and load it back with the load method.
Python pickle: https://docs.python.org/2/library/pickle.html
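For example (a minimal sketch; `net` stands for whatever object holds your trained network):

```python
import pickle

# serialize the trained network to a file
with open('model.pkl', 'wb') as f:
    pickle.dump(net, f, pickle.HIGHEST_PROTOCOL)

# later: load it back for prediction
with open('model.pkl', 'rb') as f:
    net = pickle.load(f)
```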
In this code, which object should I serialize? I tried to use the dump method to serialize 'classifier', but it didn't work: PicklingError: Can't pickle <type 'instancemethod'>
@guanxingke you should serialize the object named "params", which stores the parameters of the model. When you want to predict the classification of a new document, you should reconstruct the neural network and set the parameters to the values you saved.
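A rough sketch of that recipe (illustrative names; `params` is the parameter list built in train_conv_net, and the rebuilt network must have exactly the same architecture):

```python
import cPickle

# after training: save the raw parameter values
with open('params.pkl', 'wb') as f:
    cPickle.dump([p.get_value() for p in params], f, -1)

# at prediction time: rebuild the network as in training, then restore the values
with open('params.pkl', 'rb') as f:
    saved = cPickle.load(f)
for p, v in zip(params, saved):
    p.set_value(v)
```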
@guanxingke you can use what @MarkWuNLP mentioned above. Instead, I refactored the code into a class with two methods:
```python
def save(self, path):
    with open(path, 'wb') as f:
        pickle.dump(self, f, -1)
    logger.info('save model to path %s' % path)
    return None

@classmethod
def load(cls, path):
    with open(path, 'rb') as f:
        return pickle.load(f)
```
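Usage would then look something like this (a sketch; `ModelClass` is a placeholder for whatever class the two methods were added to):

```python
model.save('model.pkl')               # after training
model = ModelClass.load('model.pkl')  # later, for prediction
```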
@MarkWuNLP Thanks for your answer. I still have some questions. There are two params: ① the params variable in the function train_conv_net, and ② classifier.params. I assume you mean the first one. When should I serialize the params? And when should I load them?
This is my code:

```python
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    # if word vectors are allowed to change, add them as model parameters
    params += [Words]

# f = open("params.save", "wb")
# cPickle.dump(params, f)
# f.close()
# print 'Params saved.'

print "Loading params..."
fr = open("params.save", "rb")
params = cPickle.load(fr)
fr.close()
```
I save the params first and load them the next time. It doesn't work:
```
Loading params...
Traceback (most recent call last):
  File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 2358, in
```
If I load the params before the `while (epoch < n_epochs):` loop, I don't see how the params could affect my result.
@Huarong Thanks for your answer. I assume you mean the class MLPDropout. I save the class like this:

```python
classifier.save('classifier.save')
```

It's the same error I mentioned before:

```
Traceback (most recent call last):
  File "D:\Program Files (x86)\JetBrains\PyCharm Community Edition 4.5.4\helpers\pydev\pydevd.py", line 2358, in
```
@guanxingke No, I mean both the CNN parameters and the MLP features. I used cPickle to store the params successfully.
```python
savefile = file('obj.save', 'wb')
cPickle.dump(params, savefile, protocol=cPickle.HIGHEST_PROTOCOL)
savefile.close()
```
Furthermore, when you use the stored parameters, conv_layer.W_conv and conv_layer.b_conv should be reset, as well as self.dropout_layers.W and self.dropout_layers.b.
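A sketch of that resetting step (illustrative; saved_conv and saved_mlp stand for the per-layer values loaded from the pickle, and the attribute names follow the comment above):

```python
# restore each conv layer's weights and biases
for conv_layer, (W_val, b_val) in zip(conv_layers, saved_conv):
    conv_layer.W_conv.set_value(W_val)
    conv_layer.b_conv.set_value(b_val)

# restore the MLP dropout layers' weights and biases
for layer, (W_val, b_val) in zip(classifier.dropout_layers, saved_mlp):
    layer.W.set_value(W_val)
    layer.b.set_value(b_val)
```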
@MarkWuNLP I stored the params successfully too, but how do I use them to make predictions on what I write? Could you show some examples, like when and how to load the params and make further predictions? Thanks again.
@ChasonLee I have the same problem as you. Did you find an answer to your last question? Any example? Thanks
@huydan, @ChasonLee Did you guys find any example? ?_?
@ChasonLee, @huydan, @Muugii-bs Hey guys, I have made some modifications to the code so that further predictions can be made on test examples. You can find it here: https://github.com/DeepanwayGhosal/CNN_sentence Let me know if this works.
Thank you @DeepanwayGhosal 👍 I have a question: where in the code do you load the saved parameters?
@Muugii-bs I don't actually save the parameters. Along with the train and validation sets, I also pass the test set to the train_conv_net() function, and it returns the predicted test labels.
In the train_conv_net() function you can find a code snippet between these two comment lines which predicts the test labels:
```python
# So we make prediction only by taking a maximum of 2000 test examples at a time
......
# start training over mini-batches
```
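The idea behind that snippet, roughly (a sketch, not the exact code from the linked repo; test_predict stands for a compiled Theano prediction function like the ones shown later in this thread):

```python
# predict at most 2000 test examples at a time to keep memory usage bounded
chunk_size = 2000
test_preds = []
for start in xrange(0, test_set_x.shape[0], chunk_size):
    test_preds.extend(test_predict(test_set_x[start:start + chunk_size]))
```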
@huydan mentioned in his last comment that he wants a prediction example, so I have only done that here. I will try to make a model which saves and loads the parameters and can make predictions even without passing the test set to the train_conv_net() function.
Hi, I think I know how to save and load the model: you can save the params of the model and load them back with the set_value() function.

Save:

```python
with open('./modelfile', 'wb') as f:
    cPickle.dump(classifier.params, f, -1)
```

Load:

```python
with open('./model/classifier4class_all20160728.params', 'rb') as f:
    tmp = cPickle.load(f)
for i in range(len(classifier.params)):
    classifier.params[i].set_value(tmp[i].get_value())
```

Of course, before loading, you should define the classifier first, just like in the training process.
@deepanwayx @521088684 Can you please explain in more detail how the code should change if we want to save the trained model and load it at test time to predict the sentiment for one instance only?
@ChasonLee Did you find any example of how to save the model and then predict based on it? Can you please share those examples?
@521088684 Can you please specify where exactly I should dump and where I should load the model? I have dumped it at the end of this if:

```python
if val_perf >= best_val_perf:
```

and loaded it after

```python
classifier = MLPDropout(rng, input=layer1_input, layer_sizes=hidden_units,
                        activations=activations, dropout_rates=dropout_rate)

# define parameters of the model and update functions using adadelta
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    # if word vectors are allowed to change, add them as model parameters
    params += [Words]
```

and of course removed the training part and tested it on my test set.
But the problem is that when I apply it to my test data, most instances are assigned to the negative class, which is inconsistent with the accuracy I got during training.
Has anyone already successfully implemented saving and loading the model, i.e. saving the trained model and loading it at test time to predict the sentiment for one instance only?
Same request. If someone has worked this out successfully, it would be good to get some insight.
Hi,

for me the suggestion from @521088684 works. What I actually did was add the following method to the MLPDropout class:

```python
def save(self, path):
    with open(path, 'wb') as f:
        pickle.dump(self.params, f, -1)
    return None
```

Then in the train_conv_net() method you can insert `classifier.save("/.../")` anywhere after the initialization (ideally after checking whether the current model is the best one).
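For example (a sketch; val_perf and best_val_perf are the variables already used inside train_conv_net, and the path is just an example):

```python
if val_perf >= best_val_perf:
    best_val_perf = val_perf
    classifier.save('best_params.pkl')  # keep the parameters of the best model so far
```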
Then I added a new method to the conv_net_sentences.py file, where I essentially copied the train_conv_net() method with a few changes. The first part of this method looks like this:

```python
rng = np.random.RandomState(3435)
img_h = len(datasets[0]) - 1
filter_w = img_w
feature_maps = hidden_units[0]
filter_shapes = []
pool_sizes = []
for filter_h in filter_hs:
    filter_shapes.append((feature_maps, 1, filter_h, filter_w))
    pool_sizes.append((img_h - filter_h + 1, img_w - filter_w + 1))

'''
define model architecture
'''
index = T.lscalar()
x = T.matrix('x')
y = T.ivector('y')
Words = theano.shared(value=U, name="Words")
zero_vec_tensor = T.vector()
zero_vec = np.zeros(img_w)
set_zero = theano.function([zero_vec_tensor],
                           updates=[(Words, T.set_subtensor(Words[0, :], zero_vec_tensor))],
                           allow_input_downcast=True)
layer0_input = Words[T.cast(x.flatten(), dtype="int32")].reshape((x.shape[0], 1, x.shape[1], Words.shape[1]))
conv_layers = []
layer1_inputs = []
for i in xrange(len(filter_hs)):
    filter_shape = filter_shapes[i]
    pool_size = pool_sizes[i]
    conv_layer = LeNetConvPoolLayer(rng, input=layer0_input,
                                    image_shape=(batch_size, 1, img_h, img_w),
                                    filter_shape=filter_shape, poolsize=pool_size,
                                    non_linear=conv_non_linear)
    layer1_input = conv_layer.output.flatten(2)
    conv_layers.append(conv_layer)
    layer1_inputs.append(layer1_input)
layer1_input = T.concatenate(layer1_inputs, 1)
hidden_units[0] = feature_maps * len(filter_hs)
classifier = MLPDropout(rng, input=layer1_input, layer_sizes=hidden_units,
                        activations=activations, dropout_rates=dropout_rate)

'''
define parameters of the model and update functions using adadelta
'''
params = classifier.params
for conv_layer in conv_layers:
    params += conv_layer.params
if non_static:
    # if word vectors are allowed to change, add them as model parameters
    params += [Words]
cost = classifier.negative_log_likelihood(y)
dropout_cost = classifier.dropout_negative_log_likelihood(y)
grad_updates = sgd_updates_adadelta(params, dropout_cost, lr_decay, 1e-6, sqr_norm_lim)

'''
load model parameters
'''
with open(params_file, 'rb') as f:
    tmp = cPickle.load(f)
for i in range(len(classifier.params)):
    classifier.params[i].set_value(tmp[i].get_value())
del tmp

'''
prepare datasets
2) split x and y axis
'''
np.random.seed(3435)
test_set_x = datasets[:, :img_h]
test_set_y = np.asarray(datasets[:, -1], "int32")

'''
models
'''
test_pred_layers = []
test_size = batch_size  # modified line
test_layer0_input = Words[T.cast(x.flatten(), dtype="int32")].reshape((test_size, 1, img_h, Words.shape[1]))
for conv_layer in conv_layers:
    test_layer0_output = conv_layer.predict(test_layer0_input, test_size)
    test_pred_layers.append(test_layer0_output.flatten(2))
test_layer1_input = T.concatenate(test_pred_layers, 1)
test_y_pred = classifier.predict(test_layer1_input)
test_y_pred_p = classifier.predict_p(test_layer1_input)
test_y_pred_p_reduce = test_y_pred_p[:, 0]
test_error = T.mean(T.neq(test_y_pred, y))
test_model_all = theano.function([x, y], test_error, allow_input_downcast=True)
test_predict = theano.function([x], test_y_pred, allow_input_downcast=True)
test_probs = theano.function([x], test_y_pred_p_reduce, allow_input_downcast=True)
```

Afterwards you only have to split the data into batches and call the test_ functions :-)
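A sketch of that last step (illustrative; test_set_x, batch_size and test_predict as defined above, with the test set padded so that every batch has exactly batch_size rows):

```python
# pad the test set up to a multiple of batch_size, predict batch by batch,
# then drop the predictions for the padding rows again
n = test_set_x.shape[0]
extra = (-n) % batch_size
if extra > 0:
    test_set_x = np.append(test_set_x, test_set_x[:extra], axis=0)

preds = []
for start in xrange(0, test_set_x.shape[0], batch_size):
    preds.extend(test_predict(test_set_x[start:start + batch_size]))
preds = np.asarray(preds[:n])
```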
@pexmar Could you give examples of how to use these test_ functions? I have no idea how they work. Thank you so much
@Zero0one1 Did you succeed in using these functions?
No. I saved the model successfully but don't know how to make predictions on new input. Do you have any idea?