Fix llama

Open ivy-lv11 opened this issue 1 year ago • 3 comments

Fix the dtype mismatch error raised when load_in_low_bit='bf16' is used, on both CPU and GPU: RuntimeError: expected m1 and m2 to have the same dtype, but got: float != c10::BFloat16. Details can be found in the issue https://github.com/analytics-zoo/nano/issues/1111

ivy-lv11 avatar Apr 01 '24 05:04 ivy-lv11

@rnwang04 Could you please take a look?

hkvision avatar Apr 01 '24 07:04 hkvision

This PR fixes llama, but will other models hit similar issues? Shall we also set torch_dtype=torch.bfloat16 for load_in_low_bit='bf16', as we did for fp16?

rnwang04 avatar Apr 01 '24 07:04 rnwang04

> This PR fixes llama, but will other models hit similar issues? Shall we also set torch_dtype=torch.bfloat16 for load_in_low_bit='bf16', as we did for fp16?

Yes

ivy-lv11 avatar Apr 03 '24 02:04 ivy-lv11
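
The suggestion agreed on above (default torch_dtype to bfloat16 when load_in_low_bit='bf16', mirroring the fp16 case) could be sketched roughly as follows. This is a minimal illustration, not the actual ipex-llm code: the helper name with_default_torch_dtype and the mapping table are hypothetical, and dtypes are kept as names so the sketch runs without PyTorch installed.

```python
# Hypothetical sketch, not ipex-llm's implementation.
# Maps a low-bit format to the name of the torch dtype it should default to.
_LOW_BIT_DEFAULT_DTYPE = {
    "fp16": "float16",
    "bf16": "bfloat16",
}


def with_default_torch_dtype(load_in_low_bit, kwargs):
    """Return a copy of kwargs with torch_dtype defaulted for fp16/bf16 loads.

    An explicit torch_dtype supplied by the caller is never overridden,
    and other low-bit formats are left untouched.
    """
    resolved = dict(kwargs)  # copy so the caller's dict is not mutated
    default = _LOW_BIT_DEFAULT_DTYPE.get(load_in_low_bit)
    if default is not None and resolved.get("torch_dtype") is None:
        resolved["torch_dtype"] = default
    return resolved


if __name__ == "__main__":
    print(with_default_torch_dtype("bf16", {}))
    print(with_default_torch_dtype("bf16", {"torch_dtype": "float32"}))
    print(with_default_torch_dtype("sym_int4", {}))
```

In real loading code the name would be turned into a dtype object, for example getattr(torch, "bfloat16"), before being passed on as torch_dtype. Applying the default in one place for every model, rather than patching llama alone, is what would keep weights and activations in the same dtype and avoid the float != c10::BFloat16 mismatch for other architectures too.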