
Add gradient accumulation option for bicleaner-ai training

Open radinplaid opened this issue 1 year ago • 1 comment

The wiki recommends a batch size of 128 for 'stable training'.

It would be helpful to have an option to accumulate gradients, so that bicleaner-ai can be trained with a larger "effective batch size" on GPUs with a relatively small amount of memory.

Fairseq calls this option "--update-freq"; Sockeye calls it "--update-interval".

radinplaid, Oct 10 '23 12:10

Hi @radinplaid, I agree, and I've been thinking about it since I wrote the tool. Unfortunately, TensorFlow does not support it natively, so it would require us to replace the TensorFlow training loop with a hand-written one. Maybe at some point I will have time to implement it. I'd gladly accept PRs if someone wants to write it.
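For readers unfamiliar with the technique, here is a minimal, framework-free sketch of what a hand-written loop with gradient accumulation does. It is not bicleaner-ai or TensorFlow code: the one-weight model, `grad_mse`, `train_accumulated`, and the `accum_steps` parameter are all hypothetical names chosen for illustration. In a real TensorFlow loop the per-batch gradients would presumably come from `tf.GradientTape` rather than a closed-form expression. The sketch assumes equally sized micro-batches, so that averaging the micro-batch gradients equals the gradient over the whole effective batch.

```python
# Framework-free sketch of gradient accumulation (all names are hypothetical).
# Model: y_hat = w * x, loss: mean squared error over the batch.

def grad_mse(w, xs, ys):
    """Gradient of mean((w*x - y)^2) with respect to w over one batch."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


def train_accumulated(xs, ys, w=0.0, lr=0.01, micro_batch=2, accum_steps=4, epochs=1):
    """Plain SGD that applies one weight update per `accum_steps` micro-batches.

    Effective batch size = micro_batch * accum_steps.
    Assumes equally sized micro-batches; a trailing partial group of
    micro-batches is left unapplied in this sketch.
    """
    accumulated = 0.0
    seen = 0
    for _ in range(epochs):
        for start in range(0, len(xs), micro_batch):
            bx = xs[start:start + micro_batch]
            by = ys[start:start + micro_batch]
            # Divide by accum_steps so the running sum ends up as the mean
            # gradient over the whole effective batch.
            accumulated += grad_mse(w, bx, by) / accum_steps
            seen += 1
            if seen == accum_steps:
                w -= lr * accumulated  # the only place the weight changes
                accumulated = 0.0
                seen = 0
    return w


if __name__ == "__main__":
    xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
    ys = [2.0 * x for x in xs]
    # Four micro-batches of 2 behave like a single batch of 8.
    print(train_accumulated(xs, ys, micro_batch=2, accum_steps=4))
    print(0.0 - 0.01 * grad_mse(0.0, xs, ys))
```

Only one micro-batch needs to be held in memory at a time, which is why this helps on smaller GPUs: the cost is more forward and backward passes per weight update.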

ZJaume, Nov 06 '23 15:11