Saaketh Narayan
@jasonkrone is this PR still needed? Or was there a resolution from that community slack thread?
Hey! If it's not too much of a hassle, mind submitting a PR with your proposed change? I'd be happy to review
Perfect, thank you @wouterzwerink! Feel free to tag me when the PR is up.
@wouterzwerink Hey, just wanted to follow up on this, mind submitting a quick PR if/when you have some time? Thanks!!
@andreamad8 @huxuan @gongel please see the `replication` argument detailed in our docs [here](https://docs.mosaicml.com/projects/streaming/en/stable/dataset_configuration/replication_and_sampling.html#replication). @huxuan We don't have an explicit example of a Megatron integration, but as it's PyTorch-based, you...
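For intuition, here is a rough, library-free sketch of the idea behind `replication`, assuming it means that each run of `replication` consecutive ranks receives the same samples (see the linked docs for the authoritative behavior). `replica_group` is a hypothetical helper for illustration only, not part of the streaming API.

```python
def replica_group(rank: int, replication: int) -> int:
    """Return the index of the group of consecutive ranks that share samples.

    Hypothetical helper: assumes ranks are grouped in consecutive runs of
    `replication`, and that every rank in a group sees the same data.
    """
    if replication < 1:
        raise ValueError("replication must be >= 1")
    return rank // replication


world_size = 8
replication = 2  # e.g. two tensor-parallel ranks that need identical batches

# Map every rank to the data group it would read from under this assumption.
groups = [replica_group(rank, replication) for rank in range(world_size)]
print(groups)
```

Under this assumption, ranks 0 and 1 would read one partition of the data, ranks 2 and 3 the next, and so on.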
Hey @ssharpe42, there are two ways to address this: 1. You can write a custom encode-decode function for your samples. An example is [here](https://docs.mosaicml.com/projects/streaming/en/stable/fundamentals/dataset_conversion_guide.html) -- see the "Advanced use...
Hey @ssharpe42, just wondering if the above worked for you.
Closing out this issue as it has been inactive for a while.
Hey! So we looked into this and weren't able to reproduce the first behavior, but we were able to reproduce the second (PermissionError: [Errno 13] Permission denied: '/000000_locals'). The reason...
I just double-checked, and the `/tmp/streaming` directory was not present when creating the first streaming dataset. Even when creating users with equal permissions, and not killing the process in an...