Roadmap for mixed chain of multithread and multiprocessing pipelines?

Open npuichigo opened this issue 2 years ago • 2 comments

🚀 The feature

pypeln has a nice feature for chaining pipeline stages that run on different kinds of workers: processes, threads, or asyncio.

data = (
    range(10)
    | pl.process.map(slow_add1, workers=3, maxsize=4)
    | pl.thread.filter(slow_gt3, workers=2)
    | pl.sync.map(lambda x: print(x))
    | list
)
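
For comparison, here is a rough sketch of the same process-then-thread chaining using only the standard library's `concurrent.futures`. It is not pypeln's or TorchData's implementation. The helpers `map_stage`, `filter_stage`, and `run_pipeline` are hypothetical names, and `slow_add1` / `slow_gt3` are assumed stand-ins for the functions in the snippet above.

```python
import time
from concurrent.futures import Executor, ProcessPoolExecutor, ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, List


def slow_add1(x: int) -> int:
    # Assumed stand-in for a CPU-bound step.
    time.sleep(0.01)
    return x + 1


def slow_gt3(x: int) -> bool:
    # Assumed stand-in for an I/O-bound predicate.
    time.sleep(0.01)
    return x > 3


def map_stage(pool: Executor, fn: Callable[[int], int], items: Iterable[int]) -> Iterator[int]:
    # Executor.map submits every item up front and yields results in input order.
    return pool.map(fn, items)


def filter_stage(pool: Executor, pred: Callable[[int], bool], items: Iterable[int]) -> Iterator[int]:
    # Materialize the upstream results so each item can be paired with its verdict.
    materialized = list(items)
    verdicts = pool.map(pred, materialized)
    return (item for item, keep in zip(materialized, verdicts) if keep)


def run_pipeline(values: Iterable[int], map_pool: Executor, filter_pool: Executor) -> List[int]:
    # Chain the two stages; each stage may use a different kind of worker pool.
    added = map_stage(map_pool, slow_add1, values)
    return list(filter_stage(filter_pool, slow_gt3, added))


if __name__ == "__main__":
    # Processes for the map stage, threads for the filter stage.
    with ProcessPoolExecutor(max_workers=3) as procs, ThreadPoolExecutor(max_workers=2) as threads:
        print(run_pipeline(range(10), procs, threads))  # [4, 5, 6, 7, 8, 9, 10]
```

Unlike the pypeln version, this sketch has no bounded queue between stages (no equivalent of `maxsize`), and the filter stage waits for the whole map stage to finish before it starts, so there is no overlap or backpressure between the two.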

I remember that the first proposal for pytorch/data claimed to support something similar. I'd like to ask whether this is still planned and, if so, what the concrete roadmap is.

Motivation, pitch

This was part of the initial proposal.

Alternatives

No response

Additional context

No response

npuichigo, Jun 14 '23 07:06

@ejguan

npuichigo, Jun 15 '23 17:06

Sorry for the late response. TBH, this has been on our long-term roadmap since we created the TorchData project. Unfortunately, @NivekT and I are no longer working on TorchData. Stay tuned for updates.

ejguan, Jun 15 '23 17:06