Zach Mueller

471 comments by Zach Mueller

@winglian not quite yet! But I'll let you know for you to test :) (should be by end of this week!)

@winglian go ahead and try the branch out :) Note that it only works on a single GPU for now (will look at DeepSpeed tomorrow), and you shouldn't see a time...

Correct. I only tested on a tiny model just to get the API stable 😉

Now that it’s a bit more stable, I saw both memory decreases and speed increases when combining MS-AMP and TransformerEngine. More details are in the PR (so overall, purely positive results).

Correct, I'm looking into that this week

@alex-jw-brooks the idea behind this is indeed as you say :) Flag would be better, and do note that realistically `dispatch_batches` or `split_batches` shouldn't do *anything*, this is full user...

(Somewhat, currently trying to reverse engineer a few ways you did it, you guys would be *much* faster at it I imagine if you want to beat us to it...

Just as a fair warning, this will not be an immediate or quick fix, since essentially this means every single model's calculation is off when doing `output.loss`, and every single...