trl

Add gradient accumulation

Open edbeeching opened this issue 1 year ago • 1 comment

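With longer sequences and larger batches, we quickly run out of memory as soon as the batch size is greater than 1. Gradient accumulation would let us keep a larger effective batch size while holding only one small batch in memory at a time.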

edbeeching, Mar 14 '23 08:03

We could probably make use of the accelerate context manager for gradient accumulation!

lvwerra, Mar 14 '23 08:03
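
For reference, here is a minimal sketch of what gradient accumulation does, written in plain Python on a toy one-parameter model so it runs without any dependencies. It is not trl or accelerate code; the function names (`grad`, `full_batch_step`, `accumulated_step`) and the data are made up for illustration. In accelerate, the same bookkeeping is handled by passing `gradient_accumulation_steps` to `Accelerator` and wrapping the training step in `with accelerator.accumulate(model):`. How that would be wired into the trainer here is left open.

```python
# Toy linear model y = w * x with squared-error loss, in plain Python.
# All names and data below are illustrative, not part of trl or accelerate.

def grad(w, x, y):
    """Gradient of (w*x - y)**2 with respect to w for one sample."""
    return 2.0 * (w * x - y) * x


def full_batch_step(w, batch, lr):
    """One SGD step on the mean loss over the whole batch."""
    g = sum(grad(w, x, y) for x, y in batch) / len(batch)
    return w - lr * g


def accumulated_step(w, batch, lr, micro_batch_size):
    """Same update, but only `micro_batch_size` samples are processed at a time.

    Assumes len(batch) is a multiple of micro_batch_size, so every
    micro-batch has the same size and the scaling below is exact.
    """
    micro_batches = [batch[i:i + micro_batch_size]
                     for i in range(0, len(batch), micro_batch_size)]
    accumulated = 0.0
    for micro in micro_batches:
        micro_grad = sum(grad(w, x, y) for x, y in micro) / len(micro)
        # Scale so the accumulated sum equals the full-batch mean gradient.
        accumulated += micro_grad / len(micro_batches)
    # The parameter update happens once, after all micro-batches.
    return w - lr * accumulated


batch = [(1.0, 2.0), (2.0, 4.5), (3.0, 5.5), (4.0, 8.0)]
w_full = full_batch_step(0.0, batch, lr=0.01)
w_accum = accumulated_step(0.0, batch, lr=0.01, micro_batch_size=1)
```

Both calls produce the same updated weight, which is the point: a batch size of 1 with 4 accumulation steps gives the same update as a batch size of 4, at the memory cost of a single sample.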