
[Nodes] Add a ToDevice node, or combine with pin memory

Open andrewkho opened this issue 1 year ago • 2 comments

🚀 The feature

We should add a node that sends batches to the device (probably one at a time). We could make this a separate node, add it to the pre-fetcher (i.e. always call .to(device) on the head of the queue), or make it part of pin-memory.
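To make the separate-node option concrete, here is a minimal sketch of the idea in plain Python. The `ToDevice` class, its `prefetch` parameter, and the queue protocol are all hypothetical and not part of torchdata; the only assumption about a batch is that it exposes a `.to(device)` method the way `torch.Tensor` does. State management and checkpointing are deliberately left out.

```python
import queue
import threading


class ToDevice:
    """Hypothetical node: moves each batch to `device` in a background thread."""

    def __init__(self, source, device, prefetch=2):
        self.source = source      # any iterable of batches
        self.device = device      # passed through to batch.to(...)
        self.prefetch = prefetch  # max number of transferred batches held in the queue

    def __iter__(self):
        q = queue.Queue(maxsize=self.prefetch)

        def worker():
            # Runs off the main thread: transfer each batch, then signal completion.
            try:
                for batch in self.source:
                    q.put(("item", batch.to(self.device)))
                q.put(("done", None))
            except Exception as exc:
                # Forward failures so the consumer can re-raise them.
                q.put(("error", exc))

        # Daemon thread: if the consumer stops early, the worker does not block exit.
        threading.Thread(target=worker, daemon=True).start()

        while True:
            kind, value = q.get()
            if kind == "done":
                return
            if kind == "error":
                raise value
            yield value
```

Wrapping an existing iterable would then look like `for batch in ToDevice(loader, "cuda:0"): ...`. The bounded queue keeps only a few batches resident on the device at once, which is roughly what the "one at a time" note above is after.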

Motivation, pitch

Sending data to the device can be slow, and users often want this done in a background thread. DataLoader should do this in the background, since that consolidates state management.

Alternatives

No response

Additional context

No response

andrewkho avatar Dec 13 '24 20:12 andrewkho

I wonder how different this would be from doing the transfer within a Mapper, similar to a collate_fn calling tensor.to(device)?
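For comparison, the Mapper-style alternative could be sketched as a plain function that is handed to whatever map node is in use. The helper name `to_device_fn` is hypothetical, and the built-in `map` stands in for a Mapper here; the only assumption about leaves is that tensors expose `.to(device)`.

```python
def to_device_fn(device):
    """Hypothetical helper: build a map function that moves a batch to `device`."""

    def move(batch):
        # Recurse into common containers so nested batches are handled.
        if isinstance(batch, dict):
            return {key: move(value) for key, value in batch.items()}
        if isinstance(batch, list):
            return [move(value) for value in batch]
        if isinstance(batch, tuple):
            return tuple(move(value) for value in batch)
        # Tensor-like leaves are transferred; anything else passes through unchanged.
        return batch.to(device) if hasattr(batch, "to") else batch

    return move


def mapped(source, device):
    # Built-in map stands in for a Mapper node in this sketch.
    return map(to_device_fn(device), source)
```

The main difference from a dedicated node is where the transfer runs: with this approach it happens in whichever thread or process executes the map function, so it is not necessarily in the background relative to the training loop.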

divyanshk avatar Dec 13 '24 20:12 divyanshk

For cases where we have multiple threads reading data, we might be able to create multiple thread-local CUDA streams to transfer data onto the GPU. WDYT @andrewkho?

divyanshk avatar Dec 26 '24 18:12 divyanshk