How can I save a pyod model?

singyaowu opened this issue 5 years ago • 13 comments

I've just trained an auto-encoder model, and I wonder how I can save it so that I don't need to train it again the next time I want to use it. I didn't see any function related to saving a model in auto_encoder.py, so I'm not sure if there is one I can use. Have you implemented this kind of function?

singyaowu avatar May 08 '19 09:05 singyaowu

Agreed that model-save functionality should be added; marked as a todo task. I am not sure whether pickle will work (hopefully yes), and I will also run some tests.
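
For reference, the kind of test I have in mind is a quick round trip, e.g. with the KNN detector; a rough sketch only (toy data for illustration):

import pickle

import numpy as np
from pyod.models.knn import KNN

X_train = np.random.rand(100, 3)      # toy data, illustration only

clf = KNN()
clf.fit(X_train)

# round-trip the fitted detector through pickle in memory
restored = pickle.loads(pickle.dumps(clf))

# the restored detector should produce identical outlier scores
assert np.allclose(clf.decision_function(X_train),
                   restored.decision_function(X_train))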

yzhao062 avatar May 08 '19 15:05 yzhao062

When trying to save an AutoEncoder model using pickle, the following error occurs. Any idea how I can fix it?

TypeError: can't pickle _thread.RLock objects

# Code (fit_model is a helper that fits and returns the AutoEncoder)
import pickle

clf = fit_model(X_train)
pickle.dump(clf, open('./autoencoder.h5', 'wb'))

osancus avatar Aug 06 '19 19:08 osancus

@epicsol-inc Sorry for the late response. AE in pyod is written with keras, and saving the model can be tricky.

To my understanding, keras models may not be picklable (https://github.com/keras-team/keras/issues/10528)...

If saving the model is a must, you may have to copy the code out from auto_encoder.py directly. Sorry for the inconvenience.
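
Another rough workaround that might be worth testing (just a sketch, not verified): the fitted detector keeps its Keras network in clf.model_, so that network could be saved and restored with Keras' own utilities, e.g.:

# save only the underlying Keras network (clf is the fitted AutoEncoder)
clf.model_.save('autoencoder_keras.h5')

# ...and restore it later with Keras
from keras.models import load_model
restored_net = load_model('autoencoder_keras.h5')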

yzhao062 avatar Aug 14 '19 14:08 yzhao062

@epicsol-inc I managed to save it using dill (https://pypi.org/project/dill/), which has syntax very similar to pickle

with open(out_fname, 'wb') as f:
    dill.dump(model, f, dill.HIGHEST_PROTOCOL)

You can check if it works in your case.
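
Loading should then just be the mirror call (a quick sketch, reusing the same out_fname):

import dill

with open(out_fname, 'rb') as f:
    model = dill.load(f)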

sbysiak avatar Aug 14 '19 14:08 sbysiak

@sbysiak Thanks for the note, much appreciated. I will also check it out and consider adding this to the documentation :)

yzhao062 avatar Aug 15 '19 19:08 yzhao062

Any news regarding saving PyOD models? I need to save an IForest model; can I use pickle?

lgo7 avatar Oct 16 '19 17:10 lgo7

Sorry, I have not tested it out yet, but I should. If pickle is not working, I would suggest using dill (https://pypi.org/project/dill/), as mentioned above.

This is now at the top of my priority list.
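
For IForest specifically, a quick round trip to try would look roughly like this (a sketch only; swap in dill if pickle complains):

import pickle

from pyod.models.iforest import IForest

clf = IForest()
clf.fit(X_train)                        # X_train: your training data

# save the fitted detector to disk
with open('iforest.pkl', 'wb') as f:
    pickle.dump(clf, f)

# load it back later
with open('iforest.pkl', 'rb') as f:
    clf_loaded = pickle.load(f)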

yzhao062 avatar Oct 16 '19 18:10 yzhao062

I've used pickle.dump and it worked!

lgo7 avatar Oct 24 '19 23:10 lgo7

I've also used pickle.dump() for the knn, oc-svm, iforest and fabod classifiers, and saving and loading them works with:

# save
pickle.dump(clf, open(folder + clf_name + '.h5', 'wb'))
# load
pickle.loads(open(folder + 'k Nearest Neighbors (kNN).h5', 'rb').read())

AlexDelPab avatar Mar 12 '20 07:03 AlexDelPab

Pickle and dill can save the model successfully, but these formats can make loading slow. For the autoencoder model, I saved the weights as HDF5 and the classifier object as a pickle, for faster loads and less disk space.

import pickle

from pyod.models.auto_encoder import AutoEncoder

autoenModel = AutoEncoder()
autoenModel.fit(X=x_train)

# serialize the Keras model architecture to JSON
model_json = autoenModel.model_.to_json()
with open(model_path + ".json", "w") as json_file:
    json_file.write(model_json)

# serialize the weights to HDF5
autoenModel.model_.save_weights(model_path + "model.h5")

# then set the inner Keras model to None; it makes the pickled object much
# smaller (and avoids the un-picklable Keras internals)
autoenModel.model_ = None
with open(newpath + "//" + model_name + "_model" + '.pickle', 'wb') as handle:
    pickle.dump(autoenModel, handle, protocol=pickle.HIGHEST_PROTOCOL)

Model Load

# load the auto encoder instance
with open(path + "//" + model_n + "_model" + ".pickle", 'rb') as handle:
    loaded_model = pickle.load(handle)

# load the JSON architecture and create the Keras model
json_file = open(path + "//" + model_n + '.json', 'r')
loaded_model_json = json_file.read()
# strip the "ragged" field (presumably a Keras-version compatibility tweak)
loaded_model_json = loaded_model_json.replace("\"ragged\": false,", " ")
json_file.close()
loaded_model_ = model_from_json(loaded_model_json)

# load the weights into the new model
loaded_model_.load_weights(path + "//" + model_n + "model.h5")
print("Loaded model from disk")

# set the loaded Keras model back on the auto encoder instance
loaded_model.model_ = loaded_model_

This loads almost 5x faster and the saved model is about 10x smaller.

bhowmiks avatar Apr 17 '20 16:04 bhowmiks

loaded_model_ = model_from_json(loaded_model_json)

What is model_from_json? Is it this: https://www.tensorflow.org/api_docs/python/tf/keras/models/model_from_json ?

ezzeldinadel avatar Feb 09 '21 20:02 ezzeldinadel

I have tried the .pkl and .h5 extensions along with dill, pickle and joblib, but the issue persists:

Unable to save model: can't pickle _thread.RLock objects

SaqlainHussainShah avatar Jul 07 '21 12:07 SaqlainHussainShah

Hi! Where do you import that function model_from_json from? Thanks!

lfvillavicencio avatar Jun 22 '22 01:06 lfvillavicencio