`InMemoryDataset`'s `load()` method is obsolete
🛠 Proposed Refactor
Update `InMemoryDataset`'s `load()` method so that it is compatible with the latest datasets (e.g., `TUDataset`) that store 4 elements in the processed `data.pt` object (`data`, `slices`, `sizes`, `data_cls`).
Right now it is only compatible with 3 elements (`data`, `slices`, `data_cls`), or with 2 (`data` and `slices` only, kept for backward compatibility). When implementing a custom `InMemoryDataset` whose processed tuple may have any size `2 <= x <= 4`, the default `load()` method breaks on the newest 4-element layout.
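To make the three layouts concrete, here is a minimal plain-Python sketch of the dispatch (the helper name `parse_processed` and the stand-in tuples are hypothetical; in practice the tuple comes from `fs.torch_load`):

```python
def parse_processed(out):
    """Dispatch on how many elements the processed file stores (2 to 4)."""
    assert isinstance(out, tuple) and 2 <= len(out) <= 4
    sizes = data_cls = None
    if len(out) == 2:    # oldest layout: (data, slices)
        data, slices = out
    elif len(out) == 3:  # current layout: (data, slices, data_cls)
        data, slices, data_cls = out
    else:                # TUDataset-style layout: (data, slices, sizes, data_cls)
        data, slices, sizes, data_cls = out
    return data, slices, sizes, data_cls
```

For example, `parse_processed(("d", "s"))` yields `("d", "s", None, None)`, while the 4-element layout fills in all four slots.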
Suggest a potential alternative/fix
An easy fix could be to allow 4 elements in the loading method of `InMemoryDataset`, something like:
```python
from typing import Type

from torch_geometric.data import Data
from torch_geometric.data.data import BaseData
from torch_geometric.io import fs


def load(self, path: str, data_cls: Type[BaseData] = Data) -> None:
    r"""Load the dataset from the file path :obj:`path`."""
    out = fs.torch_load(path)
    assert isinstance(out, tuple)
    assert 2 <= len(out) <= 4
    if len(out) == 2:  # Backward compatibility (1): (data, slices).
        data, self.slices = out
    elif len(out) == 3:  # Backward compatibility (2): (data, slices, data_cls).
        data, self.slices, data_cls = out
    else:  # TU datasets store an additional `sizes` element in the processed file.
        data, self.slices, sizes, data_cls = out

    if not isinstance(data, dict):  # Backward compatibility.
        self.data = data
    else:
        self.data = data_cls.from_dict(data)
```

Note that `data_cls` is taken as a parameter (defaulting to `Data`) so that the 2-element backward-compatibility path still has a class to call `from_dict` on.
---

You are correct that `TUDataset.load` is not applicable due to additional information being stored. That's why we use custom logic in `TUDataset` to load the dataset. I personally think this is fine (since it is a rare case).
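The override pattern described above can be sketched without any PyTorch Geometric dependency (the class names below are hypothetical stand-ins for `InMemoryDataset` and `TUDataset`, not the real classes):

```python
class BaseInMemory:
    """Stand-in for InMemoryDataset: handles the common (data, slices) layout."""
    def load(self, out):
        self.data, self.slices = out


class TULike(BaseInMemory):
    """Stand-in for TUDataset: overrides load() to unpack the extra elements."""
    def load(self, out):
        # Custom logic for the 4-element layout (data, slices, sizes, data_cls).
        self.data, self.slices, self.sizes, self._data_cls = out


ds = TULike()
ds.load(({"x": [0, 1]}, {"x": [0, 2]}, {"num_node_labels": 3}, dict))
```

This keeps the base `load()` simple while letting the one dataset with a richer processed file handle its own unpacking.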