pointer to a similar library / feedback
📚 The doc issue
Hi!
I'm the author of SeqTools, a Python library that predates torchdata and provides essentially the same functionality as MapDataPipes.
I just wanted to let you know about it, maybe you can pick some code or ideas out of it. For instance:
- Saving the stack when a transformation node is created, so that a runtime error can be traced back to the place where that node was defined
- There is also a multiprocessing/multithreading prefetch function which kinda resembles a DataLoader (actually you can check this example which re-implements DataLoader for map-style datasets).
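To make the first point concrete, here is a minimal sketch of the idea, not the actual SeqTools or torchdata code. `LazyMap` and `build_pipeline` are hypothetical names made up for this illustration. The node records the call stack when it is constructed and attaches it to any error raised later, when an item is actually evaluated.

```python
import traceback


class LazyMap:
    """Hypothetical lazy mapping node (illustration only, not a real library API)."""

    def __init__(self, fn, source):
        self.fn = fn
        self.source = source
        # Remember where the node was created; the last frame is this
        # __init__ call itself, so it is dropped.
        self.creation_stack = traceback.format_stack()[:-1]

    def __len__(self):
        return len(self.source)

    def __getitem__(self, index):
        try:
            return self.fn(self.source[index])
        except Exception as exc:
            origin = "".join(self.creation_stack)
            # Chain the original error and report the creation site.
            raise RuntimeError(
                f"transform failed on item {index}; node was created at:\n{origin}"
            ) from exc


def build_pipeline():
    # The failure for the second item surfaces only when it is read,
    # possibly far away from this line.
    return LazyMap(lambda x: 1 / x, [1, 0, 2])


pipeline = build_pipeline()
```

Reading `pipeline[1]` raises a `RuntimeError` whose message includes the frame inside `build_pipeline`, while the original `ZeroDivisionError` stays available as `__cause__`.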
To be honest, I eventually steered away from using it in my deep learning pipelines. It is good for prototyping and pre-computing data, but in my training scripts I just refactor the transforms into a single big function. That is actually more convenient because all the necessary variables (data, parameters, augmentation variables) are available in the same scope. I think this is an issue you might face as well in the future.
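A rough sketch of what I mean by a single big function, with made-up names (`load_sample`, `crop_size`) and a toy list standing in for a real dataset. The random crop offset and the flip decision live in the same scope as the data, so they can be reused or returned without threading them through separate transform nodes.

```python
import random


def load_sample(dataset, index, crop_size, rng):
    """Hypothetical all-in-one transform: load, random crop, random flip."""
    signal, label = dataset[index]

    # Augmentation variables are plain locals, visible to every later step.
    offset = rng.randrange(len(signal) - crop_size + 1)
    flip = rng.random() < 0.5

    crop = signal[offset:offset + crop_size]
    if flip:
        crop = crop[::-1]

    # Returning the augmentation variables is trivial in this layout.
    return crop, label, {"offset": offset, "flip": flip}


toy_dataset = [([1, 2, 3, 4, 5], "a"), ([6, 7, 8, 9, 10], "b")]
crop, label, info = load_sample(toy_dataset, 0, crop_size=3, rng=random.Random(0))
```

With chained map-style transforms, each step would only see the output of the previous one, so `offset` and `flip` would have to be packed into the items themselves to reach later steps.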
Suggest a potential alternative/fix
No response
Hi @nlgranger, the stack operations look interesting. Feel free to contribute code to our library.