make torchft work for llama3_8b 8x

Open d4l3k opened this issue 10 months ago • 0 comments

as titled

it goes fast: streaming transfer cuts the test below from about 30.5 s (baseline) to about 2.4 s with 20 chunks

Test plan:

Testing with 12 GB of 64 MB tensors

baseline

took 30.493701454252005 seconds

With streaming transfer

0 chunks
took 8.783997897058725 seconds

10 chunks
took 2.8615125976502895 seconds

20 chunks
took 2.433052882552147 seconds
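
To put the timings above on a common scale, here is a minimal sketch that derives the speedup over baseline and an effective throughput for each run. It only does arithmetic on the numbers reported in the test plan; the `summarize` helper and the dictionary labels are hypothetical names introduced for illustration, and the throughput assumes the full 12 GB payload was moved in each run.

```python
# Timings copied from the test plan above, in seconds.
timings = {
    "baseline": 30.493701454252005,
    "streaming, 0 chunks": 8.783997897058725,
    "streaming, 10 chunks": 2.8615125976502895,
    "streaming, 20 chunks": 2.433052882552147,
}

# Total payload stated in the test plan (assumed to apply to every run).
TOTAL_GB = 12


def summarize(timings, total_gb):
    """Return {run: (speedup over baseline, effective GB/s)}."""
    base = timings["baseline"]
    return {name: (base / t, total_gb / t) for name, t in timings.items()}


summary = summarize(timings, TOTAL_GB)
for name, (speedup, gb_per_s) in summary.items():
    print(f"{name:22s} {speedup:6.2f}x  {gb_per_s:5.2f} GB/s")
```

On these numbers, streaming is roughly 3.5x faster than baseline with 0 chunks and roughly 12.5x faster with 20 chunks. Going from 10 to 20 chunks adds comparatively little.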

d4l3k • Feb 08 '25 00:02