Olatunji Ruwase comments

Results: 533 comments of Olatunji Ruwase

About the performance of using NVMe SSD.

@luckyq, should this issue remain open? Thanks!

[REQUEST] When training a FP16 model, the ability to set some of the layers to FP32

@BlinkDL, thanks for your question. If I understand correctly, it seems there are two parts to this. First, when you say `f(x)` is `float16` and `g(x)` is `float32`, I believe...

[REQUEST] When training a FP16 model, the ability to set some of the layers to FP32

@BlinkDL, it is great to hear you have (1) and (2) working. For us to understand the required DeepSpeed support, can you share an example that already incorporates (1) and...

[REQUEST] ZeRO-Infinity: GPU Memory Usage Higher Than Expected

@kiehls90 and @Seong-yeop, nvidia-smi is an imprecise method for tracking memory usage. Can you please use DeepSpeed's `see_memory_usage()` as described [here](https://github.com/microsoft/DeepSpeed/issues/1437#issuecomment-937981281)? Please share your logs. Thanks!

[REQUEST] ZeRO-Infinity: GPU Memory Usage Higher Than Expected

@kiehls90, in general it is really hard to comment on memory usage without more precise profiling using something like `see_memory_usage()`. My usual approach to debugging is to instrument before/after forward...
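
For readers following along, here is a minimal sketch of what instrumenting before and after the forward pass might look like. It is not taken from the thread. The `memory_checkpoint` helper and the `log_only` stand-in reporter are hypothetical names introduced only for illustration, and the commented-out usage assumes `see_memory_usage(message, force=False)` is imported from `deepspeed.runtime.utils` and that `model_engine` and `batch` already exist in your training script.

```python
from contextlib import contextmanager


@contextmanager
def memory_checkpoint(tag, report):
    """Call report(message, force=True) right before and right after a block.

    `report` is any callable with the shape report(message, force=False).
    This helper is hypothetical, not part of DeepSpeed.
    """
    report(f"before {tag}", force=True)
    try:
        yield
    finally:
        report(f"after {tag}", force=True)


def log_only(message, force=False):
    # Stand-in reporter so the sketch runs without DeepSpeed or a GPU.
    print(message)


# Assumed usage with DeepSpeed installed (model_engine and batch are
# placeholders for objects from your own training script):
#   from deepspeed.runtime.utils import see_memory_usage
#   with memory_checkpoint("forward", see_memory_usage):
#       loss = model_engine(batch)
with memory_checkpoint("forward", log_only):
    pass  # the forward pass would go here
```

Sharing the before/after lines from the training log makes it much easier to see which step accounts for the memory growth than a single nvidia-smi reading does.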

[REQUEST] ZeRO-Infinity: GPU Memory Usage Higher Than Expected

@kiehls90, did you make progress with this? Or is it no longer an issue? Thanks!

[REQUEST] ZeRO-Infinity: GPU Memory Usage Higher Than Expected

@kiehls90, is this still an issue, or can we close? Thanks!

[BUG] FP16 used for all reduce even if BFLOAT16 is enabled

@owmohamm, can you please try PR #2145?

Activation Checkpointing conflicts with Weight Sharing

@iyupan, thanks for reporting this issue. To help investigate this, can you please provide repro steps? Also, please clarify the expected behavior in this case. Should each parameter gradient be...

Bad performance when there are lots of optim_groups (for example, using layer-wise learning rate)

@BlinkDL, can you provide details to repro?