DeepSpeed icon indicating copy to clipboard operation
DeepSpeed copied to clipboard

Support model declaration in zero.Init context

Open tohtana opened this issue 1 year ago • 1 comments

This PR aims to support model declaration in zero.Init context by applying the following improvements mainly done by @eisene in #3203.

  • Partition parameters of a class dynamically defined inside zero.Init() (followed the improvement proposed by @eisene)
  • Fixed condition in tests to check if parameter partitioning is properly applied (test_zero_nesting_init.py, test_zero_dynamic_class.py) (followed the improvement proposed by @eisene and added deepspeed.initialize to catch possible errors)
  • Restore zero.Init() context after deepspeed.initialize() (validated by new test: test_zero_nesting_init.py::TestShutdownInNestingInit::test_shutdown_in_nesting_init) (followed @tjruwase's suggestion)

The main difference with #3203 is that this PR solves the issue pointed by @tjruwase. Regarding the issue, this PR also enables zero.Init() contexts to restore the state (patching functions) after deepspeed.initialize().

tohtana avatar May 22 '23 18:05 tohtana

Hi @eisene, thank you for your proposal! I like your approach using annotations.

The only concern is that your PR seems to have changes for multiple purposes now. So I would like to apply this PR right now to fix the related issues including yours (#3202). Then, can you submit another PR to implement your approach using annotations?

tohtana avatar Jun 02 '23 22:06 tohtana

Apologies for the delay. I'll submit a new PR this week only you let me know that you don't need it anymore.

eisene avatar Jun 20 '23 19:06 eisene

Apologies for the delay. I'll submit a new PR this week only you let me know that you don't need it anymore.

Hi @eisene, are you submitting a PR for your approach using annotation? If so, shouldn't we merge this PR first so that you can make your change consistent with other fixes?

tohtana avatar Jun 20 '23 22:06 tohtana

Agreed, since this one's already approved.

eisene avatar Jun 20 '23 22:06 eisene

@eisene Thank you for your reply! I will merge this PR as soon as possible. I just restarted tests because the previous tests seemed to fail due to some CI issue.

tohtana avatar Jun 20 '23 22:06 tohtana

Agreed, since this one's already approved.

Hi @eisene, this PR was finally merged. Thank you for your waiting. Now you are ready to work on your proposal.

tohtana avatar Jun 26 '23 22:06 tohtana

Thanks. I made a new pull request #3865 that has the decorator logic.

eisene avatar Jul 03 '23 15:07 eisene