dl-4-tsc
best_model.hdf5
Does it generate best_model.hdf5, or how is this supposed to work?
OSError: Unable to open file (unable to open file: name = '/data1/prjs/code/ABTS/dl_4_tsc//results/fcn/UCRArchive_2018_itr_8/Coffee/best_model.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
Yes, it uses model checkpoint.
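For context, the saving logic relies on Keras's ModelCheckpoint callback: with save_best_only=True the file is only written when the monitored metric improves, so if training crashes before the first save, best_model.hdf5 never appears. A minimal sketch of that mechanism (the tiny model and data here are illustrative, not the repo's actual fcn architecture):

```python
import numpy as np
from tensorflow import keras

# Toy model standing in for the real classifier.
model = keras.Sequential([
    keras.layers.Dense(8, activation='relu', input_shape=(4,)),
    keras.layers.Dense(2, activation='softmax'),
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# best_model.hdf5 is written only when the monitored metric improves;
# the first epoch always improves on the initial "infinite" best, so
# a successful run produces the file.
checkpoint = keras.callbacks.ModelCheckpoint(
    filepath='best_model.hdf5',
    monitor='loss',
    save_best_only=True,
)

x = np.random.rand(16, 4).astype('float32')
y = np.random.randint(0, 2, size=(16,))
model.fit(x, y, epochs=2, callbacks=[checkpoint], verbose=0)
```

If training is interrupted before the callback fires, only files saved unconditionally (like last_model.hdf5) will exist, which matches the symptom above.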
It's only generating last_model.hdf5 and model_init.hdf5, and I am getting a FileNotFoundError for best_model.hdf5.
This means that your code did not run to completion. Do you see any error when running the code?
python=3.6.8
tensorflow=1.14
When running python main.py UCRArchive_2018 Coffee fcn _itr_8, I am getting this error:
Traceback (most recent call last):
  File "main.py", line 152, in <module>
    fit_classifier()
  File "main.py", line 44, in fit_classifier
    classifier.fit(x_train, y_train, x_test, y_test, y_true)
  File "/data1/prjs/code/ABTS/dl_4_tsc/classifiers/fcn.py", line 80, in fit
    model = keras.models.load_model(self.output_directory+'best_model.hdf5')
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/tensorflow/python/keras/saving/save.py", line 146, in load_model
    return hdf5_format.load_model_from_hdf5(filepath, custom_objects, compile)
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/tensorflow/python/keras/saving/hdf5_format.py", line 200, in load_model_from_hdf5
    f = h5py.File(filepath, mode='r')
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/h5py/_hl/files.py", line 408, in __init__
    swmr=swmr)
  File "/data1/prjs/code/ABTS/venv/lib/python3.6/site-packages/h5py/_hl/files.py", line 173, in make_fid
    fid = h5f.open(name, flags, fapl=fapl)
  File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
  File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
  File "h5py/h5f.pyx", line 88, in h5py.h5f.open
OSError: Unable to open file (unable to open file: name = '/data1/prjs/code/ABTS/dl_4_tsc//results/fcn/UCRArchive_2018_itr_8/Coffee/best_model.hdf5', errno = 2, error message = 'No such file or directory', flags = 0, o_flags = 0)
This means that the model was not saved, maybe recheck the paths. If it does not work, I believe you should install TF 2.0 and work with the new version. The code works with TF 2.0 now.
With TF 2.0 I am getting this:
OSError: SavedModel file does not exist at: saved_model_dir/{saved_model.pbtxt|saved_model.pb}
I think it may be write permissions for the target directory, not quite sure though.
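One quick way to rule permissions in or out is to probe whether the process can actually create a file in the results directory before starting a long training run. A generic sketch (check_writable and the probe path are hypothetical names, not part of the repo):

```python
import os
import tempfile

def check_writable(output_directory):
    """Return True if the process can create files in output_directory."""
    try:
        os.makedirs(output_directory, exist_ok=True)
        probe = os.path.join(output_directory, '.write_test')
        with open(probe, 'w') as f:
            f.write('ok')
        os.remove(probe)
        return True
    except OSError:
        return False

# Probe a scratch directory before kicking off training.
print(check_writable(os.path.join(tempfile.gettempdir(), 'results_check')))
```

If this returns False for the configured results path, the missing best_model.hdf5 is a permissions problem rather than a training failure.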
@hfawaz @shahmustafa I'm experiencing the same issue. Here is my env:
Mac OS X: 10.15.5
Python (conda): 3.8
tensorflow 2.2.0
h5py 2.10.0 py38h3134771_0
hdf5 1.10.4
keras 2.3.1
The error seems to suggest an out-of-memory problem when the code tries to save intermediate results in HDF5; here are some clues:
https://github.com/h5py/h5py/issues/1176 https://stackoverflow.com/questions/44117315/goes-out-of-memory-when-saving-large-array-with-hdf5-py... http://www.pytables.org/cookbook/inmemory_hdf5_files.html https://www.pytables.org/cookbook/inmemory_hdf5_files.html https://stackoverflow.com/questions/40449659/does-h5py-read-the-whole-file-into-memory
In my case, the problem only arose when I started using slightly larger training data (30 KB vs. 1.8 MB). Of course, 30 KB alone wouldn't cause such a problem.
Here is the error log:
Traceback (most recent call last):
File "main.py", line 155, in <module>
fit_classifier()
File "main.py", line 44, in fit_classifier
classifier.fit(x_train, y_train, x_test, y_test, y_true)
File "/mnt/batch/tasks/shared/LS_root/jobs/datascience-ml/azureml/resnet-timeseries_1592133278_dfbeddf7/mounts/workspaceblobstore/azureml/resnet-timeseries_1592133278_dfbeddf7/classifiers/resnet.py", line 142, in fit
y_pred = self.predict(x_val, y_true, x_train, y_train, y_val,
File "/mnt/batch/tasks/shared/LS_root/jobs/datascience-ml/azureml/resnet-timeseries_1592133278_dfbeddf7/mounts/workspaceblobstore/azureml/resnet-timeseries_1592133278_dfbeddf7/classifiers/resnet.py", line 160, in predict
model = keras.models.load_model(model_path)
File "/azureml-envs/azureml_eca0112c9008c12b467c806af1888db3/lib/python3.8/site-packages/tensorflow/python/keras/saving/save.py", line 189, in load_model
loader_impl.parse_saved_model(filepath)
File "/azureml-envs/azureml_eca0112c9008c12b467c806af1888db3/lib/python3.8/site-packages/tensorflow/python/saved_model/loader_impl.py", line 110, in parse_saved_model
raise IOError("SavedModel file does not exist at: %s/{%s|%s}" %
OSError: SavedModel file does not exist at: /mnt/batch/tasks/shared/LS_root/jobs/datascience-ml/azureml/resnet-timeseries_1592133278_dfbeddf7/mounts/workspaceblobstore/azureml/resnet-timeseries_1592133278_dfbeddf7/results/resnet/rtpcr_itr_9/qtower/best_model.hdf5/{saved_model.pbtxt|saved_model.pb}
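Note that the "SavedModel file does not exist" message is just TF 2's fallback behavior: when load_model is handed a path that is not a readable HDF5 file, it tries to interpret it as a SavedModel directory, which is what the parse_saved_model frame in the traceback shows. A small guard (load_best_model is a hypothetical helper, not code from the repo) would surface the real cause instead:

```python
import os

def load_best_model(model_path):
    """Fail fast with a clear message when the checkpoint was never written."""
    if not os.path.isfile(model_path):
        raise FileNotFoundError(
            "%s does not exist; training probably failed before "
            "ModelCheckpoint could save a best model." % model_path
        )
    # Imported lazily here so the error path above needs no TF install.
    from tensorflow import keras
    return keras.models.load_model(model_path)
```

With this, a missing best_model.hdf5 raises a plain FileNotFoundError pointing at the checkpoint, rather than the confusing SavedModel message.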
@hfawaz Could you give us the specific versions of all dependencies that worked for you at publication time?
Has anyone been able to fix this problem?