Fix llama
Fix dtype mismatch error when load_in_low_bit='bf16'
Both CPU and GPU hit the same error: RuntimeError: expected m1 and m2 to have the same dtype, but got: float != c10::BFloat16. Details can be found in the issue https://github.com/analytics-zoo/nano/issues/1111
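For context, here is a minimal sketch that reproduces this class of mismatch (the exact error text varies by PyTorch version; the shapes are purely illustrative):

```python
import torch
import torch.nn.functional as F

# float32 activations hitting a bf16 weight, as happens when the model is
# loaded with load_in_low_bit='bf16' but the inputs stay float32
x = torch.randn(1, 4)                        # float32 input
w = torch.randn(8, 4, dtype=torch.bfloat16)  # bf16 linear weight
F.linear(x, w)  # raises RuntimeError: dtypes of mat1 and mat2 must match
```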
@rnwang04 Could you please take a look?
This PR fixes llama, but will other models hit similar issues?
Shall we add torch_dtype=torch.bfloat16 for load_in_low_bit='bf16', as we did for fp16?
> This PR fixes llama, but will other models hit similar issues? Shall we add torch_dtype=torch.bfloat16 for load_in_low_bit='bf16', as we did for fp16?
Yes
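For reference, a minimal sketch of what that default could look like, assuming a hypothetical resolve_torch_dtype helper in the loading path (the real entry point in the codebase may differ):

```python
import torch

# Hypothetical helper; the actual loading code in the repo may look different.
def resolve_torch_dtype(load_in_low_bit, torch_dtype=None):
    # Default the model dtype from load_in_low_bit so weights and
    # activations agree, mirroring the existing fp16 handling.
    if torch_dtype is None:
        if load_in_low_bit == 'bf16':
            return torch.bfloat16
        if load_in_low_bit == 'fp16':
            return torch.float16
    return torch_dtype
```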