SubpopBench
Error loading MIMICNotes
I successfully downloaded MIMICNotes following the instructions provided here: https://github.com/YyzHarry/SubpopBench/blob/main/MedicalData.md#mimicnotes
When I try to train a model on MIMICNotes, I get the following error when loading the features.npy file, raised from this line:
https://github.com/YyzHarry/SubpopBench/blob/4d3dbbe21029666ef19d040e110ec22908640c5b/subpopbench/dataset/datasets.py#L473
raise ValueError("Object arrays cannot be loaded when "
ValueError: Object arrays cannot be loaded when allow_pickle=False
I then added allow_pickle=True to the np.load() call above, which fixed this error. But then I got a different error from this line:
https://github.com/YyzHarry/SubpopBench/blob/4d3dbbe21029666ef19d040e110ec22908640c5b/subpopbench/dataset/datasets.py#L478
return self.x_array[int(x), :].astype('float32')
IndexError: too many indices for array: array is 0-dimensional, but 2 were indexed
Upon inspection, self.x_array does not look like a standard numpy ndarray, but rather like a sparse matrix in Compressed Sparse Row (CSR) format.
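For reference, here is a minimal sketch of what I think is happening, along with a workaround that might apply. It assumes features.npy holds a 0-dimensional object array wrapping a SciPy CSR matrix, which would match the "0-dimensional" message above, but I have not confirmed how the file was actually written. The toy data and the `get_row` helper are hypothetical and not part of SubpopBench.

```python
import os
import tempfile

import numpy as np
from scipy import sparse

# Toy stand-in for features.npy (assumption: the real file is a
# 0-d object array that wraps a SciPy CSR matrix).
csr = sparse.csr_matrix(np.array([[0.0, 1.5, 0.0],
                                  [2.0, 0.0, 0.0]]))
wrapped = np.empty((), dtype=object)
wrapped[()] = csr
path = os.path.join(tempfile.mkdtemp(), "features.npy")
np.save(path, wrapped, allow_pickle=True)

# Loading needs allow_pickle=True and returns the 0-d wrapper,
# so loaded[int(x), :] raises the IndexError shown above.
loaded = np.load(path, allow_pickle=True)

# .item() unwraps the single Python object stored in the 0-d array.
x_array = loaded.item()


def get_row(x_array, i):
    """Hypothetical helper: return row i as a dense 1-D float32 array."""
    return x_array[int(i)].toarray().ravel().astype("float32")


row = get_row(x_array, 1)
```

Densifying one row at a time keeps memory use low, but I am not sure this is the intended way to read the features.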
Could you please advise how to correctly load and index this dataset?
Thanks!