Dirk Groeneveld
This probably depends on #4.
It's inefficient that every file has to be written to disk before it can be processed. Sometimes I just want to download a file and manipulate it in memory right away.
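One possible shape for this, as a minimal sketch rather than the project's actual code: if the processing step is written against `BufRead` instead of a file path, the same function can accept either a file on disk or bytes that are already in memory. The helper `count_nonempty_lines` and the sample data are hypothetical and only stand in for the real per-file work.

```rust
use std::io::{BufRead, Cursor};

/// Hypothetical stand-in for the per-file processing step.
/// It accepts any buffered reader, so the caller decides whether
/// the data comes from disk or from an in-memory buffer.
fn count_nonempty_lines<R: BufRead>(reader: R) -> std::io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        // Propagate I/O errors, then skip blank lines.
        if !line?.trim().is_empty() {
            count += 1;
        }
    }
    Ok(count)
}

fn main() -> std::io::Result<()> {
    // Assumed to be bytes that were just downloaded; nothing is written to disk.
    let downloaded: Vec<u8> = b"{\"text\": \"a\"}\n\n{\"text\": \"b\"}\n".to_vec();

    // Cursor wraps the buffer so it can be read like a file.
    let n = count_nonempty_lines(Cursor::new(downloaded))?;
    println!("{n} non-empty lines");
    Ok(())
}
```

The on-disk path would stay available by passing `std::io::BufReader::new(std::fs::File::open(path)?)` to the same function.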
@chris-ha458 has made some great improvements to BFF in the https://github.com/allenai/dolma repo. We should back-port those changes here, especially the ones that have to do with correctness (like the ones...