
Issue in saving Keras model into DBFS folder

Open nareshr8 opened this issue 5 years ago • 4 comments

Hi team, I am using Azure Databricks, building a pipeline with Spark and a model with Keras and TensorFlow. Recently I had to update my cluster from runtime 5.4 to 6.2, and since then the model fails to save with the error message "Operation not supported".

I reported the same to the h5py team here. @danzafar was kind enough to respond, suggesting I try saving to /tmp directly instead of a DBFS location. That worked.

He also suggested that it should work if I use a tf.keras model, but that is what I am already using. Moreover, if I use h5py directly and try to save some data to a DBFS location, it still fails.
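For example, even a minimal h5py write fails when pointed at the /dbfs mount (a sketch; the exact path is illustrative):

```python
import h5py

# Writing an HDF5 file straight through the DBFS FUSE mount fails
# on runtime 6.x with "OSError: ... Operation not supported"
with h5py.File("/dbfs/tmp/test.h5", "w") as f:  # illustrative path
    f.create_dataset("data", data=[1, 2, 3])
```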

Can someone help us out?

nareshr8 avatar Jan 03 '20 06:01 nareshr8

@nareshr8 - I apologize, I thought this worked with tf.keras, but it looks like I was mistaken. If you do this using MLflow, which was developed at Databricks, it should work just fine. Thanks!
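Something like this should work (a minimal sketch, assuming an MLflow version with the `mlflow.keras` flavor; the artifact path and run URI are illustrative):

```python
import mlflow
import mlflow.keras

# MLflow stages the model locally and uploads the artifact itself,
# so nothing writes HDF5 directly through the DBFS FUSE mount
with mlflow.start_run():
    mlflow.keras.log_model(model, artifact_path="model")

# Load it back later by run URI (fill in the real run id)
# model2 = mlflow.keras.load_model("runs:/<run_id>/model")
```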

danzafar avatar Jan 29 '20 22:01 danzafar

@danzafar Thanks for letting me know.

nareshr8 avatar Feb 04 '20 04:02 nareshr8

Hey, do we have anything on this? It looks like it's still not resolved; I am also facing the same issue.

Juggernaut1997 avatar Sep 01 '21 10:09 Juggernaut1997

A little hack that I found here: https://stackoverflow.com/questions/67017306/unable-to-save-keras-model-in-databricks

Save locally in /tmp:

```python
model.save('/tmp/model.h5')
```

Then copy the model to DBFS:

```python
dbutils.fs.cp("file:/tmp/model.h5", "dbfs:/tmp/model.h5")
display(dbutils.fs.ls("/tmp/model.h5"))
```

To load it back, copy the file from DBFS to local storage and load it:

```python
dbutils.fs.cp("dbfs:/tmp/model.h5", "file:/tmp/model.h5")

from tensorflow import keras
model2 = keras.models.load_model("/tmp/model.h5")
```
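The same pattern can be wrapped in a small helper (a minimal sketch; the function name is mine, and `dbutils` is assumed to be available in the notebook):

```python
import os

def save_model_to_dbfs(model, dbfs_path, local_dir="/tmp"):
    """Save a Keras model to local disk, then copy it into DBFS.

    Works around the "Operation not supported" error that h5py
    raises when writing directly through the DBFS FUSE mount.
    """
    local_path = os.path.join(local_dir, os.path.basename(dbfs_path))
    model.save(local_path)  # plain POSIX write on the driver
    dbutils.fs.cp(f"file:{local_path}", f"dbfs:{dbfs_path}")
    return dbfs_path

# Usage (illustrative): save_model_to_dbfs(model, "/tmp/model.h5")
```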

Athena75 avatar Feb 08 '22 18:02 Athena75