lovekdl
When I used LISA to fine-tune Llama-2-7b on alpaca-gpt4-en with one A100 80G, the memory usage increased sharply and exceeded 80G. I want to know how to solve this problem... Config:...
@neteroster Hello. I have the same problem as you. Have you solved it?
@research4pan Currently I use DeepSpeed + LoRA for Llama-2-7b fine-tuning, and memory consumption is normal now. But when I use LISA without DeepSpeed, I still have the problem of...
I think there might be an optimizer problem. If I call self.freeze_all_layers() in the __init__() of class DynamicLayerActivationCallback(TrainerCallback), memory consumption stays at a normal 17G.
It seems that each time new layers are activated, memory consumption increases, while layers that were already activated before do not increase it again.
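This pattern is consistent with how AdamW lazily allocates its momentum buffers: state is created for a parameter the first time it receives a gradient, and it is never freed when the parameter is frozen again. A tiny PyTorch sketch (not LMFlow code) illustrating the suspected mechanism:

```python
import torch

# Two-layer toy model; the optimizer sees all parameters up front.
model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.Linear(8, 8))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)

def step_and_count():
    model(torch.randn(4, 8)).sum().backward()
    opt.step()
    opt.zero_grad(set_to_none=True)
    # AdamW allocates exp_avg/exp_avg_sq only for params that got gradients,
    # so len(opt.state) counts params with optimizer state.
    return len(opt.state)

# Step 1: layer 1 frozen, layer 0 active -> state only for layer 0's params.
for p in model[1].parameters():
    p.requires_grad_(False)
n_first = step_and_count()

# Step 2: switch -- freeze layer 0, activate layer 1. New state is allocated
# for layer 1, but layer 0's state is kept, so total memory grows.
for p in model[0].parameters():
    p.requires_grad_(False)
for p in model[1].parameters():
    p.requires_grad_(True)
n_second = step_and_count()

print(n_first, n_second)  # state grows from 2 params to 4 and never shrinks
```

If this is the cause, memory would plateau once every layer has been activated at least once, which matches the observation that re-activated layers add no further memory.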