NostalgiaOfTime
Is there any update? I'm still confused about this.
@ScottishFold007 Looking at the source code, it seems float16 is used by default, and gradient_checkpointing and only_optimize_lora cannot be used at the same time, so with the officially released code you have to give up gradient_checkpointing if you want LoRA. In principle it shouldn't take up this much GPU memory: with LoRA, each layer only has two low-rank matrices that need backpropagation, so the number of parameters actually being optimized is very small.
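For reference, here is a minimal sketch (not the DeepSpeed-Chat code) of a LoRA-style linear layer with a frozen base weight and two trainable low-rank matrices, just to show how small the trainable parameter count is; the class name, rank, and dimensions are illustrative assumptions:

```python
# Minimal sketch of a LoRA-style layer: frozen base weight plus two trainable
# low-rank factors A and B. Names and sizes are illustrative, not the real code.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, in_features, out_features, r=8):
        super().__init__()
        # Frozen pretrained weight: no gradients, so it is not optimized.
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02,
                                   requires_grad=False)
        # The only trainable parameters: two low-rank matrices per layer.
        self.lora_A = nn.Parameter(torch.zeros(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))

    def forward(self, x):
        return x @ self.weight.T + (x @ self.lora_A.T) @ self.lora_B.T

layer = LoRALinear(4096, 4096, r=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable: {trainable} / total: {total}")
# ~65K trainable vs ~16.8M total for this single layer
```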
The source code states that "gradient_checkpointing" and "only_optimize_lora" cannot be used at the same time, so you can either decrease the batch size or drop only_optimize_lora.
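The check looks roughly like the hypothetical sketch below; the flag names follow this thread, but this is not the exact DeepSpeed-Chat code:

```python
# Hypothetical sketch of the mutual-exclusion check described above.
# Flag names are taken from this thread; the actual script may differ.
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--gradient_checkpointing", action="store_true")
parser.add_argument("--only_optimize_lora", action="store_true")
parser.add_argument("--per_device_train_batch_size", type=int, default=8)
args = parser.parse_args()

if args.gradient_checkpointing and args.only_optimize_lora:
    raise ValueError(
        "gradient_checkpointing and only_optimize_lora cannot be enabled together; "
        "drop one of them, or reduce --per_device_train_batch_size instead."
    )
```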
@yaozhewei In the current framework, gradient checkpointing and only_optimize_lora apparently cannot be used at the same time.
Same here: the error occurs when I use ZeRO-3, but it runs correctly with ZeRO-1.
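For anyone trying to reproduce this, the ZeRO stage is controlled by the `zero_optimization.stage` field of the DeepSpeed config. A minimal sketch as a Python dict (values here are placeholders, not the script's defaults):

```python
# Minimal DeepSpeed config sketch; switching "stage" between 3 and 1 is what
# the comment above refers to. Batch size and fp16 settings are placeholders.
ds_config = {
    "train_micro_batch_size_per_gpu": 4,
    "fp16": {"enabled": True},
    "zero_optimization": {
        "stage": 1,  # the error reportedly appears with stage 3 but not stage 1
    },
}
```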