Error loading MIMICNotes

Open kiranchari opened this issue 7 months ago • 0 comments

I successfully downloaded MIMICNotes following the instructions provided here: https://github.com/YyzHarry/SubpopBench/blob/main/MedicalData.md#mimicnotes

When I try to train a model on MIMICNotes, I get the following error while loading the features.npy file, raised from this line:

https://github.com/YyzHarry/SubpopBench/blob/4d3dbbe21029666ef19d040e110ec22908640c5b/subpopbench/dataset/datasets.py#L473

raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False

I then added allow_pickle=True to the np.load() call above, which fixed this error. However, I now get a different error from this line:

https://github.com/YyzHarry/SubpopBench/blob/4d3dbbe21029666ef19d040e110ec22908640c5b/subpopbench/dataset/datasets.py#L478

    return self.x_array[int(x), :].astype('float32')
IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed

Upon inspection, self.x_array does not look like a standard numpy ndarray but a sparse matrix in Compressed Sparse Row format.
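
For reference, here is a minimal sketch of what I think is happening. It assumes features.npy was written by passing a SciPy CSR matrix directly to np.save, which wraps it in a 0-dimensional object array. The toy matrix and the `get_row` helper are hypothetical stand-ins, not code from the repository. Calling `.item()` on the loaded array appears to recover the sparse matrix, and a row can then be densified before the cast:

```python
import io

import numpy as np
from scipy import sparse

# Toy stand-in for features.npy (assumption: the real file was saved the same way).
csr = sparse.csr_matrix(np.array([[0.0, 1.5, 0.0], [2.0, 0.0, 0.0]]))
buf = io.BytesIO()
np.save(buf, csr, allow_pickle=True)  # pickles the matrix inside a 0-d object array
buf.seek(0)

loaded = np.load(buf, allow_pickle=True)
print(loaded.shape, loaded.dtype)  # 0-d object array, so loaded[i, :] fails

# Unwrap the single Python object held by the 0-d array.
x_array = loaded.item()


def get_row(x_array, i):
    """Hypothetical replacement for the failing lookup: densify one sparse row."""
    return x_array[int(i), :].toarray().ravel().astype("float32")


row = get_row(x_array, 0)
print(row.shape, row.dtype)
```

Only one row is densified per lookup, so the full matrix stays sparse in memory. I am not sure whether this is the intended way to load the data.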

Could you please advise how to correctly load and index this dataset?

Thanks!

kiranchari, Dec 02 '23 07:12