[REQUEST] Mixed dtype for model parameters

Open jedyang97 opened this issue 2 years ago • 2 comments

Is your feature request related to a problem? Please describe. Is it possible to support models with mixed parameter dtypes? For example, a large multimodal LLM might have a vision encoder in dtype=float32 and its LLM part in dtype=bfloat16. This would be particularly helpful since some customized vision models (e.g., MinkowskiEngine) don't support float16/bfloat16.

Describe the solution you'd like Have a flag (e.g., dont_change_dtype) in DeepSpeedEngine that allows loading an nn.Module without modifying the dtypes of its parameters (e.g., some params might be float32 while others are bfloat16).
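To illustrate the request: plain PyTorch already allows per-submodule dtypes, and the ask is for DeepSpeedEngine not to override them at initialization. Below is a minimal sketch of the mixed-dtype pattern in plain PyTorch; the model and submodule names (`vision_encoder`, `llm`) are illustrative placeholders, not DeepSpeed API.

```python
import torch
import torch.nn as nn

# Hypothetical multimodal model: the vision encoder stays in float32
# while the LLM submodule is cast to bfloat16.
class MultimodalModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.vision_encoder = nn.Linear(16, 32)  # kept in float32
        self.llm = nn.Linear(32, 32)             # will be cast to bfloat16

    def forward(self, x):
        feats = self.vision_encoder(x)             # float32 compute
        return self.llm(feats.to(torch.bfloat16))  # bfloat16 compute

model = MultimodalModel()
model.llm.to(torch.bfloat16)  # cast only the LLM part, not the whole model

# The two submodules now hold parameters of different dtypes.
assert model.vision_encoder.weight.dtype == torch.float32
assert model.llm.weight.dtype == torch.bfloat16

out = model(torch.randn(4, 16))
print(out.dtype)  # torch.bfloat16
```

Passing such a model to `deepspeed.initialize` with bf16 enabled would currently cast all parameters uniformly, which is what the proposed flag would opt out of.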

jedyang97 avatar Nov 16 '23 03:11 jedyang97

Hello, did you find any solutions for this?

ZCMax avatar Jul 09 '24 05:07 ZCMax

This would be a very practical feature, because some custom modules do not support bfloat16.

HAL-42 avatar Apr 29 '25 14:04 HAL-42