DeepSpeed



Fix missing scale attributes for GPTJ

Open cmikeh2 opened this issue 1 year ago • 0 comments

This PR fixes two GPT-J regressions introduced in the DeepSpeed-Chat release:

Checks that each parameter has a scale attribute before accessing it.
Changes the workspace offsets to avoid a scenario where the same buffer is used twice and its data is overwritten.
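
The two fixes can be illustrated with a minimal Python sketch. This is not the code from the PR: `collect_scales` and `assign_offsets` are hypothetical helpers, and the parameter objects are plain stand-ins rather than DeepSpeed tensors. It only shows the general pattern of guarding an optional attribute and of laying out workspace regions so that none of them overlap.

```python
from types import SimpleNamespace


def collect_scales(params):
    """Return each parameter's `scale`, or None when the attribute is absent.

    Hypothetical helper: getattr with a default avoids an AttributeError
    on parameters that never had a `scale` attached.
    """
    return [getattr(p, "scale", None) for p in params]


def assign_offsets(sizes):
    """Place regions back to back in one workspace so no two share memory.

    Hypothetical helper: returns the start offset of each region and the
    total workspace size required.
    """
    offsets = []
    cursor = 0
    for size in sizes:
        offsets.append(cursor)
        cursor += size  # the next region starts where this one ends
    return offsets, cursor


# Stand-in parameters: one carries a scale, the other does not.
params = [SimpleNamespace(scale=0.5), SimpleNamespace()]
scales = collect_scales(params)

# Three regions of 4, 8 and 2 elements.
offsets, total = assign_offsets([4, 8, 2])
```

With offsets derived from a running cursor, each region ends exactly where the next begins, so a write to one region cannot clobber another.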

Apr 15 '23 18:04 cmikeh2