Olatunji Ruwase

Results 533 comments of Olatunji Ruwase

@szhengac I am really sorry to hear that this has been a blocking issue for over a month. My concern is that this PR creates a strange code path that could be...

Got it. Thanks for sharing this scenario. In that case, what if you called [optimizer.refresh_fp32_params()](https://github.com/microsoft/DeepSpeed/blob/24055597bafe133299900a7e243663dd92bdde2c/deepspeed/runtime/zero/stage1.py#L1031-L1032) from your client script after load_checkpoint() returns? Can you check if this achieves your...
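
A minimal sketch of that suggestion, under stated assumptions: `load_and_refresh` is a hypothetical helper name, `engine` is assumed to be an object exposing `load_checkpoint()` and an `optimizer` attribute (as a DeepSpeed engine does), and `refresh_fp32_params()` is assumed to exist only on some optimizers, such as the ZeRO stage 1 optimizer linked above. It is an illustration of the call order, not a tested recipe for any particular DeepSpeed version.

```python
def load_and_refresh(engine, load_dir, tag=None):
    """Load a checkpoint, then refresh the optimizer's fp32 params if supported.

    `engine` is assumed to provide load_checkpoint() and an `optimizer`
    attribute; this helper is hypothetical and not part of DeepSpeed.
    """
    # Step 1: restore the checkpoint through the engine.
    result = engine.load_checkpoint(load_dir, tag=tag)

    # Step 2: only after load_checkpoint() has returned, ask the optimizer
    # to refresh its fp32 copies. The method is looked up defensively because
    # not every optimizer is assumed to define it.
    refresh = getattr(engine.optimizer, "refresh_fp32_params", None)
    if callable(refresh):
        refresh()

    return result
```

From a client script this would be called in place of a bare `load_checkpoint()` call, for example `load_and_refresh(model_engine, checkpoint_dir)`, where both names are placeholders.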

@szhengac I am happy to look into the ZeRO-1 regression. Can you open an issue and provide repro steps? Regarding the original finetuning issue, here is my description of the...

@harishsg99, thanks for adding this feature. Can you please add some unit tests? Thanks!

@roywei, thanks for this PR. Can you please add some unit tests? An appropriate location would be `tests/unit/test_data.py`

@roywei, can you please address formatting issues using instructions [here](https://github.com/microsoft/DeepSpeed/blob/master/CONTRIBUTING.md#prerequisites)? Thanks.

@roywei, just wanted to check if you were able to finish this PR? Thanks!

@pacman100, can you please confirm what you observe with this issue? Does the run crash or simply hang? I am trying to match this with my observations. Thanks!

I see this error message in the gist log. Can you confirm that pybind11 is installed?
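
One quick way to check, sketched with the standard library only (`is_module_installed` is a hypothetical helper name, and this only tells you whether the module is importable in the Python environment that runs the script):

```python
import importlib.util


def is_module_installed(name: str) -> bool:
    """Return True if a top-level module named `name` can be found."""
    # find_spec returns None when no importable module has this name.
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print("pybind11 installed:", is_module_installed("pybind11"))
```

Run it with the same interpreter used for the failing build, since a different environment may give a different answer.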

@Santosh-Gupta, can you share more details on this, such as the model and command line?