
HDF5 in tensorflow 2.4

Open polaschwoebel opened this issue 3 years ago • 1 comments

Hi all, thanks for the cool dataset! I am trying to use it in TensorFlow and have run into the following problem: newer versions of Keras (such as the one shipped with TF 2.4) no longer seem to include HDF5Matrix, and when I run older code I get a warning referring me to the new HDF5 functionality in TensorFlow I/O: https://www.tensorflow.org/io/api_docs/python/tfio/v0/IODataset#from_hdf5

However, when I run

train_dataset = tfio.v0.IODataset.from_hdf5(xpath, '/camelyonpatch_level_2_split_train_x', tf.int64)

I get the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: slice index 0 of dimension 0 out of bounds. [Op:StridedSlice] name: IOFromHDF5/HDF5IODataset/strided_slice/

Any ideas why this is? Could it be the file itself that doesn't come in the right shape? Thanks!

polaschwoebel, Mar 18 '21 09:03

Hi @polaschwoebel, I faced the same problem and was able to load the data using Python's h5py library. The code goes like this:

import h5py
import numpy as np

x_filename = "camelyonpatch_level_2_split_train_x.h5"
y_filename = "camelyonpatch_level_2_split_train_y.h5"
h5X = h5py.File(x_filename, 'r')
h5y = h5py.File(y_filename, 'r')
# np.array(...) reads each dataset fully into memory
X = np.array(h5X.get('x'))
y = np.array(h5y.get('y')).reshape([-1, 1])  # one label per row

I found this code at: https://github.com/alexmagsam/metastasis-detection/blob/master/data.py
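The snippet above reads the key 'x', while the failing from_hdf5 call asked for '/camelyonpatch_level_2_split_train_x'. That suggests (as an assumption, not something confirmed in this thread) that the dataset inside the file has a different name than the file itself, which might explain the out-of-bounds error. A minimal sketch for checking which dataset names, shapes, and dtypes a file actually contains; list_datasets is a hypothetical helper name, not part of h5py:

```python
import h5py


def list_datasets(path):
    """Return {dataset_name: (shape, dtype)} for every dataset in an HDF5 file."""
    found = {}

    def visit(name, obj):
        # visititems walks both groups and datasets; keep only the datasets.
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as f:
        f.visititems(visit)
    return found


# Example usage (file name taken from the snippet above; adjust the path):
# for name, (shape, dtype) in list_datasets(
#         "camelyonpatch_level_2_split_train_x.h5").items():
#     print(f"/{name}: shape={shape}, dtype={dtype}")
```

If the listing shows a dataset such as /x, passing that name and a matching dtype to from_hdf5 may be worth trying, although that is untested here.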

MohammedHAlali, Nov 11 '21 17:11