
pointer to a similar library / feedback

Open • nlgranger opened this issue 2 years ago • 1 comment

📚 The doc issue

Hi! I'm the author of a Python library called SeqTools. It predates torchdata and provides essentially the same functionality as MapDataPipes. I just wanted to let you know about it; maybe you can pick some code or ideas out of it. For instance:

  • Saving the stack when a transformation node is created, so that a runtime error can be traced back to where the node was defined
  • There is also a multiprocessing/multithreading prefetch function which kinda resembles a DataLoader (actually you can check this example which re-implements DataLoader for map-style datasets).
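The first idea can be sketched roughly as follows. This is only an illustration using the standard `traceback` module: `MappedSequence` and `build_pipeline` are hypothetical names, not the SeqTools or torchdata API.

```python
import traceback


class MappedSequence:
    """Hypothetical lazy map over an indexable sequence (illustration only)."""

    def __init__(self, fn, data):
        self.fn = fn
        self.data = data
        # Record where this node was created; drop the last frame (this __init__).
        self.creation_stack = traceback.extract_stack()[:-1]

    def __len__(self):
        return len(self.data)

    def __getitem__(self, index):
        try:
            return self.fn(self.data[index])
        except Exception as exc:
            # Format only the innermost creation frame to keep the message short.
            origin = "".join(traceback.format_list(self.creation_stack[-1:]))
            raise RuntimeError(
                f"transform failed on item {index}; node was created at:\n{origin}"
            ) from exc


def build_pipeline():
    # The error for the third item is raised lazily, far from this line.
    return MappedSequence(lambda x: 1 / x, [1, 2, 0])


pipeline = build_pipeline()
```

Accessing `pipeline[2]` raises a `RuntimeError` whose message names `build_pipeline` as the creation site, while the original `ZeroDivisionError` stays available as `__cause__`.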

To be honest, I have eventually steered away from using it in my deep learning pipelines. It is good for prototyping and pre-computing data, but in my training scripts I just refactor the transforms into a single big function. That is actually more convenient because all necessary variables (data, parameters, augmentation variables) are available in the same scope. I think this is an issue you might face as well in the future.
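A minimal sketch of what such a single-function transform might look like, assuming toy data. `load_sample`, `crop` and `flip_prob` are hypothetical names chosen for illustration, not code from SeqTools or torchdata.

```python
import random


def load_sample(index, rng, crop=2, flip_prob=0.5):
    # Hypothetical toy data: a "signal" is just a list of integers.
    signal = list(range(index, index + 8))
    # Augmentation variables live in the same scope as the data and parameters,
    # so later steps (or the returned sample) can use them directly.
    offset = rng.randrange(len(signal) - crop + 1)
    flipped = rng.random() < flip_prob
    window = signal[offset:offset + crop]
    if flipped:
        window.reverse()
    return {"window": window, "offset": offset, "flipped": flipped}


sample = load_sample(3, random.Random(0))
```

With chained per-step transforms, `offset` and `flipped` would have to be threaded through every intermediate item to reach a later step; here they are ordinary local variables.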

Suggest a potential alternative/fix

No response

nlgranger • Apr 12 '22 22:04

Hi @nlgranger, the stack operations look interesting. Feel free to contribute code to our library.

VitalyFedyunin • Jul 06 '22 18:07