composer
composer copied to clipboard
[ckpt-rewr] Get Model State Dict Util Function
What does this PR do?
Adds an API for extracting model state dict
Is this just pulling out existing code into a helper fn?
@eracah it would be great to get slightly more description so I know what parts to carefully read over and what is less important
It's detailed in the design doc, but I can copy and paste it in if you want
Is this just pulling out existing code into a helper fn? @eracah it would be great to get slightly more description so I know what parts to carefully read over and what is less important
It's detailed in the design doc, but I can copy and paste it in if you want
It's easier when reviewing PRs to either link to the right part of design doc or copy paste description
Is this just pulling out existing code into a helper fn? @eracah it would be great to get slightly more description so I know what parts to carefully read over and what is less important
It's detailed in the design doc, but I can copy and paste it in if you want
It's easier when reviewing PRs to either link to the right part of design doc or copy paste description
Ok added description
few nits, one larger comment/question.
I'm guessing there are a lot of tests that could be simplified by using the functionality that this PR adds. Is that true? If so, is it worth trying to do at least some of that as part of this PR? It would also implicitly test the functionality being added more, since those would be real uses cases.
Yes in theory, we would want to do that. However, in this PR we are just adding the API to be used in a script; we aren't actually adding this code to be used in Trainer. So until we swap out state.py state dict generation code for this one, we don't need to change those other tests or add any E2E tests.