Hybrid Engine Refactor and Llama Inference Support

Open cmikeh2 opened this issue 1 year ago • 0 comments

This PR introduces a number of features and bugfixes:

  • The Hybrid Engine integration with Containers has been refactored. Models that support the Hybrid Engine now inherit from a feature container, either the HybridEngineContainer itself or something more specialized for the particular model architecture.
  • Llama support for both inference and Hybrid Engine accelerated RLHF training
  • Additional BF16 compilation support
  • Additional unit test coverage for new operators and data types
  • Cleanup of unused code
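
The container refactor in the first bullet can be pictured with a minimal sketch. This is not DeepSpeed's actual code: every class, method, and parameter name below (`ToyHybridEngineContainer`, `ToyGatedMLPContainer`, `ToyLlamaContainer`, `inference_params`, and so on) is hypothetical and only illustrates the idea that a model opts into the Hybrid Engine by inheriting from a feature container, either the base one or a more specialized one.

```python
from abc import ABC, abstractmethod


class ToyHybridEngineContainer(ABC):
    """Hypothetical stand-in for a Hybrid Engine feature container."""

    def __init__(self):
        self.mode = "train"

    @abstractmethod
    def inference_params(self):
        """Return the parameter names the inference path would consume."""

    def set_mode(self, mode):
        # Switch between the training and inference views of the same weights.
        if mode not in ("train", "inference"):
            raise ValueError(f"unknown mode: {mode}")
        self.mode = mode


class ToyGatedMLPContainer(ToyHybridEngineContainer):
    """Hypothetical specialization for architectures with a gated MLP."""

    def mlp_params(self):
        return ["gate_proj", "up_proj", "down_proj"]


class ToyLlamaContainer(ToyGatedMLPContainer):
    """Inherits from the specialized feature container."""

    def inference_params(self):
        return ["qkv"] + self.mlp_params()


class ToyPlainContainer(ToyHybridEngineContainer):
    """Inherits from the base feature container directly."""

    def inference_params(self):
        return ["qkv", "fc_in", "fc_out"]


def supports_hybrid_engine(container):
    # In this sketch, support is expressed purely through inheritance.
    return isinstance(container, ToyHybridEngineContainer)
```

In this sketch, Hybrid Engine support becomes a property of the class hierarchy rather than of per-model special cases, so a new architecture only needs to pick the feature container that matches its layer layout.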

cmikeh2 · May 01 '23 22:05