Saaketh Narayan

51 comments by Saaketh Narayan

Hey, thanks for raising this issue. Your current solution, which filters in the dataloader, will work, but as you say, it may be slow and can pose issues with batch...
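To illustrate what filtering in the dataloader can look like, here is a minimal sketch in plain Python. It is not the original poster's code or anything from Streaming; `make_filtering_collate` and the length-based predicate are hypothetical names chosen for this example.

```python
from typing import Any, Callable, Dict, List

Sample = Dict[str, Any]


def make_filtering_collate(
    keep: Callable[[Sample], bool],
) -> Callable[[List[Sample]], List[Sample]]:
    """Build a collate function that drops samples failing `keep` (hypothetical helper)."""

    def collate(batch: List[Sample]) -> List[Sample]:
        # Filtering runs after the samples were already fetched and decoded,
        # so rejected samples still cost I/O and compute.
        return [sample for sample in batch if keep(sample)]

    return collate


# Hypothetical predicate: keep only samples whose text has at least 5 characters.
collate = make_filtering_collate(lambda s: len(s["text"]) >= 5)

batch = [{"text": "hi"}, {"text": "hello world"}, {"text": "streaming"}]
filtered = collate(batch)  # the batch shrinks from 3 samples to 2
```

A function like this could be passed as the `collate_fn` argument of a PyTorch `DataLoader`. Because samples are dropped after loading, the batches that come out can be smaller than the requested batch size, and a batch can even be empty.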

Hey @universome, this is really useful! Filtering with StreamingDataset is something the team has thought about, but yes, it is hard to have filtering in conjunction with elastic determinism and...

Hey y’all, thanks for bringing this issue to our attention. We’re looking into this and will get back to you soon.

I skimmed through the blog and the PyTorch issue. Is this an issue particular to Streaming, or is it on the PyTorch side? StreamingDataLoader is a simple (stateful) subclass of PyTorch’s DataLoader....

So Streaming is designed for fast random sample access, from shards that live on disk. Samples, outside of dataloader prefetching, are never kept in memory. We conserve RAM to do...

@XiaohanZhangCMU is this ready for another round of review? It would be good to get it in.

@XiaohanZhangCMU what remaining changes do we need here?

Closing out this issue as it has been inactive for a while.

Hey @gaow0007, support for sample-level permissioning for datasets is not currently planned for Streaming. Closing out this issue since it's been inactive for a while.