Olatunji Ruwase

648 comments by Olatunji Ruwase

@inkcherry, your analysis is correct. And yes, configuration 2 is misleading. But we use `bf16_optimizer` in this case for historical reasons from the bloom176b training and because we have not...

> I have the same error. I want to know that if I use this code to modify DeepSpeed now, Can it work correctly at BF16?

Yes, please try the...

@L-hongbin, are you still working on this PR?

@inkcherry, are you still interested in this PR? It seems @L-hongbin is no longer interested, so I would like to close it. Thanks!

@muellerzr, I am curious about the I/O speeds in your OP. Can you please confirm that you are transferring weights from NVMe to HBM at 75-90GB/sec? Are you able to...

> (Because yes, I'd love to know what planet has a 75-90GB/s non-RAID M.2 as well!)

@muellerzr, thanks for the clarification. As you may have guessed, fast I/O is a...

![image](https://github.com/huggingface/transformers/assets/4271600/17398177-380b-4f24-a05f-835fa4894e85) @muellerzr, your NVMe is blazingly fast, ~14GB/sec reads. May I request your contribution to the following? https://github.com/microsoft/DeepSpeed/issues/998

> @tjruwase do let me know if you see anything else odd about what I’ve done here etc too/if you have insights. I’ll look into the DeepSpeed stuff in a...