How much GPU memory is required to fine-tune the BELLE-7B-2M model? Out-of-memory error on an A100

Open Amy234543 opened this issue 1 year ago • 6 comments

How much GPU memory does fine-tuning the BELLE-7B-2M model require? I am fine-tuning it on an A100 and getting an out-of-memory error.

Amy234543 avatar Mar 31 '23 06:03 Amy234543

Please share your code.

weberrr avatar Mar 31 '23 07:03 weberrr

image @weberrr

Amy234543 avatar Mar 31 '23 08:03 Amy234543

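Is an A100 with 80 GB of memory enough to fine-tune the non-quantized BELLE-7B-2M model? The quantized model cannot answer correctly after fine-tuning. @weberrr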
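For context on this question, here is a rough back-of-the-envelope estimate, not a measurement. It assumes full-parameter fine-tuning with Adam in mixed precision, which is commonly counted as about 16 bytes per parameter (fp16 weights and gradients, plus fp32 master weights and two fp32 Adam moments), and it assumes roughly 7 billion parameters. Activations, buffers, and fragmentation are ignored, so real usage is higher. The function name and constants are illustrative only.

```python
# Rough estimate of model-state memory for full fine-tuning with Adam
# in mixed precision. Activations and temporary buffers are NOT included.

# Bytes per parameter (assumed breakdown for mixed-precision Adam):
#   fp16 weights (2) + fp16 gradients (2)
#   + fp32 master weights (4) + Adam momentum (4) + Adam variance (4)
BYTES_PER_PARAM = 2 + 2 + 4 + 4 + 4  # = 16

# Assumption: about 7 billion parameters for a "7B" model.
NUM_PARAMS = 7_000_000_000


def model_state_gb(num_params: int, num_gpus: int = 1) -> float:
    """Model-state memory per GPU in GB (10^9 bytes).

    With num_gpus > 1 this assumes the states are sharded evenly
    across GPUs, as full sharding (ZeRO stage 3 style) would do.
    """
    total_bytes = num_params * BYTES_PER_PARAM
    return total_bytes / num_gpus / 1e9


single_gpu_gb = model_state_gb(NUM_PARAMS)
eight_gpu_gb = model_state_gb(NUM_PARAMS, num_gpus=8)

print(f"1 GPU : {single_gpu_gb:.0f} GB of model states")
print(f"8 GPUs: {eight_gpu_gb:.0f} GB of model states per GPU")
```

Under these assumptions the model states alone exceed 80 GB on a single card, which would be consistent with the out-of-memory error reported above. Even when sharded over 8 GPUs they leave little headroom on 16 GB cards once activations are added.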

Amy234543 avatar Mar 31 '23 08:03 Amy234543

Can 8 GPUs with 16 GB each train the BELLE-7B-2M model?

whywhy258 avatar Apr 03 '23 06:04 whywhy258

You can try DeepSpeed with a CPU offload configuration; otherwise 16 GB of GPU memory is not enough.
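As a minimal sketch of what such a configuration might look like (not the configuration used by the BELLE repository), the snippet below builds a DeepSpeed config with ZeRO stage 3 and both optimizer states and parameters offloaded to CPU. The batch size, accumulation steps, fp16 setting, and the helper name `build_offload_config` are assumptions chosen for illustration and should be adapted to the actual training script.

```python
import json


def build_offload_config(micro_batch_size: int = 1,
                         grad_accum_steps: int = 8) -> dict:
    """Build a DeepSpeed config dict with ZeRO-3 and CPU offload.

    All numeric values are illustrative assumptions, not tuned settings.
    """
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        "gradient_clipping": 1.0,
        # Assumption: fp16 training; switch to bf16 if the GPUs support it.
        "fp16": {"enabled": True},
        "zero_optimization": {
            # Stage 3 shards parameters, gradients, and optimizer states.
            "stage": 3,
            # Keep optimizer states in CPU memory instead of on the GPU.
            "offload_optimizer": {"device": "cpu", "pin_memory": True},
            # Keep parameters in CPU memory when they are not in use.
            "offload_param": {"device": "cpu", "pin_memory": True},
            "overlap_comm": True,
            "contiguous_gradients": True,
        },
    }


config = build_offload_config()
# Serialize to JSON; this text can be saved to a file for the launcher.
config_json = json.dumps(config, indent=2)
print(config_json)
```

How the resulting JSON is passed in depends on the training script. CPU offload trades GPU memory for host memory and extra PCIe traffic, so training is typically slower than running fully on the GPU.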

xianghuisun avatar Apr 05 '23 00:04 xianghuisun

Hello, I got training running with LoRA, but the loss drops to 0 at the second step. Why is that?

whywhy258 avatar Apr 06 '23 08:04 whywhy258

We have updated the code and improved it based on DeepSpeed-Chat. You can run your experiments with the latest code.

xianghuisun avatar Apr 25 '23 13:04 xianghuisun