keras-molecules

Getting tables.exceptions.HDF5ExtError: HDF5 error back trace

Open vinayakumarr opened this issue 7 years ago • 9 comments

When I run python preprocess.py data/smiles_50k.h5 data/processed.h5, it fails with tables.exceptions.HDF5ExtError. The full error is in the attached image. How can I fix this?

vinayakumarr • Jun 17 '17 16:06

The data files are larger than 50 MB, so they are stored with Git LFS.

You need to install Git LFS (https://git-lfs.github.com/), then run

git lfs pull

to download the files.
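
A quick way to tell whether the .h5 files were actually downloaded: a file that Git LFS has not fetched yet is a small text pointer whose first line begins with "version https://git-lfs.github.com/spec/v1". The sketch below checks for that prefix. The function name is_lfs_pointer is made up for illustration and is not part of keras-molecules.

```python
# Prefix that opens every Git LFS pointer file (LFS spec v1).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file at `path` looks like a Git LFS pointer.

    A pointer is a tiny text stub left in the working tree when the real
    binary content has not been downloaded.
    """
    with open(path, "rb") as handle:
        head = handle.read(len(LFS_POINTER_PREFIX))
    return head == LFS_POINTER_PREFIX


# Example (path taken from the command in this issue):
# is_lfs_pointer("data/smiles_50k.h5")
```

If this returns True for data/smiles_50k.h5, the HDF5 library is being asked to open a text stub, which would explain why it reports an error.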

alainrichardt • Jun 18 '17 22:06

Now I get a different error when I run

sudo python train.py data/processed.h5 model.h5 --epochs 20

Using Theano backend.
Traceback (most recent call last):
  File "train.py", line 65, in <module>
    main()
  File "train.py", line 43, in main
    model.create(charset, latent_rep_size = args.latent_dim)
  File "/home/sachin/vinay/chemistry/keras-molecules/molecules/model.py", line 23, in create
    _, z = self._buildEncoder(x, latent_rep_size, max_length)
  File "/home/sachin/vinay/chemistry/keras-molecules/molecules/model.py", line 81, in _buildEncoder
    return (vae_loss, Lambda(sampling, output_shape=(latent_rep_size,), name='lambda')([z_mean, z_log_var]))
  File "/usr/local/lib/python2.7/dist-packages/keras/engine/topology.py", line 585, in __call__
    output = self.call(inputs, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/keras/layers/core.py", line 659, in call
    return self.function(inputs, **arguments)
  File "/home/sachin/vinay/chemistry/keras-molecules/molecules/model.py", line 68, in sampling
    epsilon = K.random_normal(shape=(batch_size, latent_rep_size), mean=0., std = epsilon_std)
TypeError: random_normal() got an unexpected keyword argument 'std'

vinayakumarr • Jun 19 '17 04:06

This is due to a change in the Keras API: the parameter std has been renamed to stddev.

Change the code and submit a pull request :)

alainrichardt • Jun 19 '17 21:06

Yes, I have corrected it. I think you are using the same array as both data and label, for training and for testing (train.py, line 54). Why? Also, you are passing the test data as validation data. Is there a separate program to calculate the accuracy on the test data set? I would also like to know whether the code does classification or prediction.

In my view it is a kind of prediction. Am I right?

vinayakumarr • Jun 21 '17 10:06

I'm a lurker in this repo; I don't use the train/test code.

alainrichardt • Jun 21 '17 15:06

You're right, the latter should be "data_test". In general, "train_gen.py" should be used instead; it should be less demanding on your machine.

I wouldn't call an autoencoder or a VAE classification or prediction. Instead, I would call it representation learning, à la https://hips.seas.harvard.edu/blog/2013/02/04/predictive-learning-vs-representation-learning/

pechersky • Jun 21 '17 19:06

For the record, if you are using the latest version of TensorFlow with Keras, the API has changed std => stddev
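
If the same model.py has to run against both older and newer Keras installs, one option is a small wrapper that picks whichever keyword the installed backend accepts. This is only a sketch: random_normal_compat is a hypothetical helper, not something Keras or keras-molecules provides, and it assumes the backend function is an ordinary Python callable whose signature can be inspected.

```python
import inspect


def random_normal_compat(random_normal, shape, mean=0.0, scale=1.0):
    """Call `random_normal` with `stddev=` if it accepts that keyword,
    otherwise fall back to the older `std=` spelling.

    `random_normal` is the backend function to call (for example
    K.random_normal); it is passed in so this helper imports nothing.
    """
    params = inspect.signature(random_normal).parameters
    keyword = "stddev" if "stddev" in params else "std"
    return random_normal(shape=shape, mean=mean, **{keyword: scale})


# In the sampling function from the traceback this would read roughly:
# epsilon = random_normal_compat(K.random_normal,
#                                (batch_size, latent_rep_size),
#                                mean=0., scale=epsilon_std)
```

If only one Keras version matters, simply renaming the keyword in model.py is the simpler fix.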

delton137 • Jul 27 '17 19:07

One way to resolve the exception is to check out, download, or replace the data files.

dtchang • Oct 26 '17 19:10

I get an error when I run

python preprocess.py data/smiles_500k.h5 data/processed_500.h5

  File "preprocess.py", line 85, in <module>
    main()
  File "preprocess.py", line 72, in main
    apply_fn=lambda ch: np.array(map(one_hot_encoded_fn,
  File "preprocess.py", line 63, in create_chunk_dataset
    chunks=tuple([chunk_size]+list(dataset_shape[1:])))
  File "/home/vinay/chemistrytensor/local/lib/python2.7/site-packages/h5py/_hl/group.py", line 105, in create_dataset
    dsid = dataset.make_new_dset(self, shape, dtype, data, **kwds)
  File "/home/vinay/chemistrytensor/local/lib/python2.7/site-packages/h5py/_hl/dataset.py", line 76, in make_new_dset
    if isinstance(chunks, tuple) and (-numpy.array([ i>=j for i,j in zip(tmp_shape,chunks) if i is not None])).any():
TypeError: The numpy boolean negative, the - operator, is not supported, use the ~ operator or the logical_not function instead.
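
Going by the traceback, the failing line is inside h5py (dataset.py, line 76), not in preprocess.py: it negates a boolean array with unary minus, which newer NumPy releases reject. The error message names the replacement itself. The sketch below reproduces that check with ~ instead. The helper name chunk_exceeds_shape is hypothetical, and this mirrors only the one expression shown in the traceback, not the rest of h5py's code.

```python
import numpy as np


def chunk_exceeds_shape(shape, chunks):
    """Return True if any chunk dimension is larger than the dataset
    dimension, skipping dimensions whose size is None.

    Same test as the line in the traceback, with `~` in place of `-`.
    """
    # dtype=bool keeps the array boolean even when every dimension is None.
    fits = np.array(
        [i >= j for i, j in zip(shape, chunks) if i is not None],
        dtype=bool,
    )
    # `-fits` raises the TypeError from the traceback; `~fits` (or
    # np.logical_not(fits)) is the supported boolean negation.
    return bool((~fits).any())
```

Since the line lives in site-packages, installing an h5py version that matches the installed NumPy is probably cleaner than editing the file by hand.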

vinayakumarr • Dec 17 '17 07:12