DeepSpeed
Add BigCode models support
This PR adds support for BigCode models. As you can see in https://github.com/microsoft/DeepSpeed/issues/3811, it's a pretty popular architecture.
If you have any questions, please feel free to ask.
Also, I don't know how to add tests for these models. If someone could help me out with that, I would be very grateful.
@cupertank - is this still a PR you'd like to see completed?
Closing this PR as stale.