vaex
Vaex is not able to read an HDF5 file from Azure Blob Storage
Hi
I am using HSDS to create an HDF5 file in Azure Blob Storage, as shown below.
fHSDS = h5pyd.File(HSDS_PATH + FILE_NAME, "w")
dset_hsds = fHSDS.create_dataset(DATASET_NAME, (NUM_ROWS, NUM_COLS), dtype='float64',
                                 maxshape=(None, NUM_COLS),
                                 chunks=(CHUNK_SIZE[0], CHUNK_SIZE[1]))
for iRow in range(0, NUM_ROWS, CHUNK_SIZE[0]):
    dset_hsds[iRow:iRow+CHUNK_SIZE[0]-1, :] = randomData[iRow:iRow+CHUNK_SIZE[0]-1, :]
fHSDS.close()
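A side note on the write loop above, separate from the FileNotFoundError: Python slices exclude the end index, so the `-1` in `iRow:iRow+CHUNK_SIZE[0]-1` appears to leave the last row of every chunk unwritten. The sketch below reproduces the pattern with plain lists instead of an HSDS dataset; `NUM_ROWS`, `CHUNK_ROWS`, `copy_in_chunks`, and `end_offset` are illustrative names, and the zero-filled destination is an assumption standing in for an unwritten dataset with a default fill value.

```python
# Stand-in sizes; the original post does not give the real values.
NUM_ROWS = 10
CHUNK_ROWS = 5

# Nonzero stand-in "rows" so that unwritten (zero) slots are easy to spot.
source = list(range(1, NUM_ROWS + 1))


def copy_in_chunks(src, chunk_rows, end_offset):
    """Copy src into a zero-filled list chunk by chunk.

    end_offset=-1 mimics the slice bounds used in the posted loop;
    end_offset=0 uses the usual half-open slice.
    """
    dst = [0] * len(src)
    for start in range(0, len(src), chunk_rows):
        stop = start + chunk_rows + end_offset
        dst[start:stop] = src[start:stop]
    return dst


as_posted = copy_in_chunks(source, CHUNK_ROWS, end_offset=-1)
corrected = copy_in_chunks(source, CHUNK_ROWS, end_offset=0)

# Indices that were never written when the "-1" bound is used.
missing_rows = [i for i, value in enumerate(as_posted) if value == 0]
print(missing_rows)  # [4, 9]
```

If that is not intended, dropping the `-1` from both slices writes every row of each chunk.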
When I try to read the same HDF5 file from the blob with Vaex, using the code below, I get "FileNotFoundError: /blob_name/home/testFile_fromPython.h5":
df = vaex.open("/blob_name/home/testFile_fromPython.h5", fs=fs)
With the same code, if I read a Parquet or CSV file instead, Vaex loads it as a DataFrame without problems.
The same scenario also works locally: when I create an HDF5 file on my local machine and read it with Vaex, it loads as a DataFrame.
Please help me read the HDF5 file from Azure Blob Storage.
Thanks in advance.
I don't know if it matters, but can you change your file extension to .hdf5?
Also, can you please format your code? It is very hard to figure out what is happening right now.