ValueError: Unable to create group (name already exists) problem

Open clevilll opened this issue 2 years ago • 5 comments

Hi, I recently ran into a problem when using this package. The traceback is as follows:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-43-c3b5d8180301> in <module>()
----> 1 model.fit(X, y, epochs=50, batch_size=16, callbacks=[replay])

2 frames
/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py in error_handler(*args, **kwargs)
     65     except Exception as e:  # pylint: disable=broad-except
     66       filtered_tb = _process_traceback_frames(e.__traceback__)
---> 67       raise e.with_traceback(filtered_tb) from None
     68     finally:
     69       del filtered_tb

/usr/local/lib/python3.7/dist-packages/deepreplay/callbacks.py in on_train_begin(self, logs)
     83         self.n_epochs = self.params['epochs']
     84 
---> 85         self.group = self.handler.create_group(self.group_name)
     86         self.group.attrs['samples'] = self.params['samples']
     87         self.group.attrs['batch_size'] = self.params['batch_size']

/usr/local/lib/python3.7/dist-packages/h5py/_hl/group.py in create_group(self, name, track_order)
     63             name, lcpl = self._e(name, lcpl=True)
     64             gcpl = Group._gcpl_crt_order if track_order else None
---> 65             gid = h5g.create(self.id, name, lcpl=lcpl, gcpl=gcpl)
     66             return Group(gid)
     67 

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/_objects.pyx in h5py._objects.with_phil.wrapper()

h5py/h5g.pyx in h5py.h5g.create()

ValueError: Unable to create group (name already exists)

I have shared the Google Colab notebook, which ran without problems on Aug 2, 2019, but in 2021 it throws KeyError: 'samples'. Please check the notebook and feel free to run it for quick debugging.

These are the configuration and package versions in Google Colab:

matplotlib==3.2.2
matplotlib-inline==0.1.3
matplotlib-venn==0.11.6
numpy==1.19.5
pandas==1.1.5
pandas-datareader==0.9.0
pandas-gbq==0.13.3
pandas-profiling==1.4.1
scikit-learn==1.0.1
scipy==1.4.1
seaborn==0.11.2
sklearn-pandas==1.8.0
3.7.12
Python 3.7.12

This is the full code:

from keras.models import Sequential
from keras.layers import Dense
#from keras.optimizers import SGD
from tensorflow.keras.optimizers import SGD
from keras.initializers import glorot_normal, normal

model = Sequential()
model.add(Dense(input_dim=2,
                units=2,
                activation='sigmoid',
                kernel_initializer=glorot_normal(seed=42),
                name='hidden'))
model.add(Dense(units=1,
                activation='sigmoid',
                kernel_initializer=normal(seed=42),
                name='output'))

model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.05), metrics=['acc'])

from deepreplay.callbacks import ReplayData
from deepreplay.datasets.parabola import load_data
from deepreplay.replay import Replay

X, y = load_data()

replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1')
model.fit(X, y, epochs=50, batch_size=16, callbacks=[replay])

I found similar open issues here & here related to saving a model to an HDF5 file. Based on some suggestions (12195# and here), I tried saving the model in tf format instead of h5 in replay_filename, but that was unsuccessful. There are lots of posts about this on Stack Overflow. Any help will be highly appreciated.

clevilll avatar Dec 06 '21 15:12 clevilll

Hi,

If you run the code multiple times, it raises this error because it does not overwrite the data inside an existing group in the H5 file. So, you have two options:

  • delete the .h5 file and run it again
  • change the group name every time you run the same code, e.g. replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1_001')

Hopefully this will solve your issue.
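
As a rough sketch of both options, assuming only the Python standard library (the helper names remove_stale_file and unique_group_name are made up for illustration and are not part of deepreplay):

```python
import os
import uuid


def remove_stale_file(filename):
    """Option 1: delete the HDF5 file left over from a previous run, if any.

    Returns True if a file was removed, False if there was nothing to remove.
    """
    if os.path.exists(filename):
        os.remove(filename)
        return True
    return False


def unique_group_name(prefix):
    """Option 2: append a short random suffix so each run gets its own group."""
    return f"{prefix}_{uuid.uuid4().hex[:8]}"


# Hypothetical usage with the names from this thread:
# remove_stale_file('hyperparms_in_action.h5')
# replay = ReplayData(X, y, filename='hyperparms_in_action.h5',
#                     group_name=unique_group_name('part1'))
```

Note that deleting the file discards every group stored in it, so the second option is probably the safer one if the file already holds runs you want to keep.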

Best, Daniel

dvgodoy avatar Dec 07 '21 16:12 dvgodoy

Thanks for the reply. I picked the second option and replaced the script with the following, so that the group name changes every time I run the cell:

import datetime as dt
dtime = dt.time()
now = dt.datetime.now()
zeit = now.strftime("%Y-%m-%d %H:%M:%S")
print(f"Last update of notebook:{zeit}")

X, y = load_data()

#replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name='part1')
replay = ReplayData(X, y, filename='hyperparms_in_action.h5', group_name=f'part1_{zeit}')

However, the errors mentioned in the issue remain, and I couldn't figure out what's wrong when I run the code in the Google Colab notebook.

clevilll avatar Dec 08 '21 13:12 clevilll

@dvgodoy is there any update from your side?

clevilll avatar Dec 18 '21 21:12 clevilll

Hi,

The error you got, KeyError: 'samples', is the same one from issue #29. Please follow the instructions I wrote there to downgrade some of the packages, then restart the kernel, and you should be able to run it.

Hope it helps. Best, Daniel

dvgodoy avatar Apr 28 '22 18:04 dvgodoy

Hi, I still can't solve it after downgrading. Please see the Google Colab notebook.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-11-13c5c2ef98eb> in <module>
     17                 name='output'))
     18 # Compile the model
---> 19 model.compile(loss='binary_crossentropy', optimizer=SGD(learning_rate=0.05), metrics=['acc'])
     20 
     21 from deepreplay.callbacks import ReplayData

1 frames
/usr/local/lib/python3.7/dist-packages/keras/engine/training.py in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, weighted_metrics, target_tensors, **kwargs)
     92                 `optimizer`, `loss`, `metrics` or `sample_weight_mode`.
     93         """
---> 94         self.optimizer = optimizers.get(optimizer)
     95         self.loss = loss or []
     96         self.metrics = metrics or []

/usr/local/lib/python3.7/dist-packages/keras/optimizers.py in get(identifier)
    766     else:
    767         raise ValueError('Could not interpret optimizer identifier: ' +
--> 768                          str(identifier))

ValueError: Could not interpret optimizer identifier: <tensorflow.python.keras.optimizer_v2.gradient_descent.SGD object at 0x7fb0ce9413d0>

clevilll avatar Aug 31 '22 14:08 clevilll
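
In the last traceback, compile runs from the standalone keras package (keras/engine/training.py) while the SGD object comes from tensorflow.python.keras. That message usually points to a model and an optimizer imported from two different packages, so importing both from the same one (for instance, restoring the commented-out from keras.optimizers import SGD line) is a likely fix, though it has not been confirmed in this thread. A minimal sketch of a check that reports the mismatch before compile is called; the helper names are hypothetical and it relies only on each class's __module__ attribute:

```python
def top_level_package(obj):
    """Return the top-level package that defines obj's class, e.g. 'keras'."""
    return type(obj).__module__.split(".")[0]


def check_same_package(model, optimizer):
    """Raise early, with a clearer message, if model and optimizer are mixed.

    Returns the shared top-level package name when they match.
    """
    model_pkg = top_level_package(model)
    optimizer_pkg = top_level_package(optimizer)
    if model_pkg != optimizer_pkg:
        raise TypeError(
            f"model comes from '{model_pkg}' but optimizer comes from "
            f"'{optimizer_pkg}'; import both from the same package"
        )
    return model_pkg


# Hypothetical usage before compiling:
# check_same_package(model, SGD(learning_rate=0.05))
```

This is only a diagnostic: it does not change which Keras implementation deepreplay itself expects.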