nonechucks
PyTorch's IterableDataset
Hello, I've been using this (excellent) library for a while, and I just came across a new feature in PyTorch: an IterableDataset class that seems to be meant to solve the exact issues this library was trying to solve.
Is this correct? I feel like nonechucks does more than what can be done with that class, but it seems to me that safe data loading and transforms as filters can both be done with it (provided one is careful when using multiple DataLoader workers).
Could you give an example (or link) demonstrating how IterableDataset could be used to handle bad samples?
You could simply not yield the sample if it fails some check, i.e. in the __iter__ method:
    def __iter__(self):
        for sample in samples:
            if self.is_valid(sample):
                yield sample
That's the rough idea at least!
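To make that a bit more concrete, here is a minimal self-contained sketch of the same idea. The class name FilteredDataset, the is_valid check (which just treats None as a bad sample), and the toy data are all made up for illustration; only IterableDataset, DataLoader, and get_worker_info come from torch.utils.data. The worker branch is there because each DataLoader worker gets its own copy of an iterable-style dataset, so without some kind of sharding every worker would yield the full set of samples.

```python
from torch.utils.data import DataLoader, IterableDataset, get_worker_info


class FilteredDataset(IterableDataset):
    """Hypothetical example: yields only the samples that pass is_valid."""

    def __init__(self, samples):
        self.samples = list(samples)

    def is_valid(self, sample):
        # Placeholder check (an assumption): None marks a bad sample.
        return sample is not None

    def __iter__(self):
        info = get_worker_info()
        if info is None:
            # Single-process loading: this iterator sees every sample.
            shard = self.samples
        else:
            # Multi-worker loading: give each worker a disjoint slice,
            # otherwise every worker would yield every sample.
            shard = self.samples[info.id::info.num_workers]
        for sample in shard:
            if self.is_valid(sample):
                yield sample


dataset = FilteredDataset([1, None, 2, 3, None, 4])
loader = DataLoader(dataset, batch_size=2)  # num_workers defaults to 0
batches = [batch.tolist() for batch in loader]
print(batches)  # [[1, 2], [3, 4]]
```

Because the bad samples are dropped before batching, the batches stay full without any custom collate function. One trade-off with this approach is that an IterableDataset has no __len__ or index access, so shuffling and samplers have to be handled inside __iter__ if you need them.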