
freeze modules does not reduce memory in dllib

Open emartinezs44 opened this issue 1 year ago • 2 comments

Some approaches like LoRA aim to reduce memory during the training phase, but in dllib this does not work as expected. Freezing 99% of the model consumes the same memory as the model with nothing frozen. It seems the only thing freeze does is skip updating the weights, yet using more threads still increases memory, and it shouldn't, because those weights are frozen. Any suggestion for changing this behaviour?
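For reference, the setup is roughly the following (a minimal sketch with hypothetical sizes and layer names, not the real model):

```scala
import com.intel.analytics.bigdl.dllib.nn.{Linear, ReLU, Sequential}

// Sketch: freeze every layer except the last one, the way a LoRA-style
// fine-tune keeps the base model fixed. Sizes and names are made up.
val model = Sequential[Float]()
(1 to 10).foreach { i =>
  // freeze() marks the layer so its weights are not updated during training
  model.add(Linear[Float](1024, 1024).setName(s"fc$i").freeze())
  model.add(ReLU[Float]())
}
model.add(Linear[Float](1024, 10).setName("head")) // only trainable layer

// Training this with more than one thread uses as much memory as the fully
// trainable model, which is the behaviour this issue is about.
```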

emartinezs44 avatar Sep 04 '23 17:09 emartinezs44

I just checked the code. For Linear, for example, gradWeight is still resized to (outputSize, inputSize) even when the layer is frozen; the values are just all zeros. So the memory consumption is the same as for an unfrozen module. I will inform you if I find a solution.
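A quick way to see this (a sketch against the dllib Scala API; the sizes are arbitrary):

```scala
import com.intel.analytics.bigdl.dllib.nn.Linear
import com.intel.analytics.bigdl.dllib.tensor.Tensor

// Freeze a single Linear layer and run one forward/backward pass.
val linear = Linear[Float](1024, 1024)
linear.freeze()

val input = Tensor[Float](16, 1024).rand()
linear.forward(input)
val gradOutput = Tensor[Float](16, 1024).rand()
linear.backward(input, gradOutput)

// gradWeight is still allocated at the full 1024 x 1024 size; freezing only
// keeps its values at zero, it does not skip the allocation.
println(linear.gradWeight.size().mkString("x")) // 1024x1024
```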

qiuxin2012 avatar Sep 06 '23 02:09 qiuxin2012

That part is expected; the problem appears when you start more than one thread in the training phase. In the forward phase every thread creates its own copy of the output of each module, whether the module is frozen or not, and that is the problem. Since these weights are frozen, a thread should not need to store a copy of the forward result of every frozen layer, because its weights won't change. Besides, the same argument applies to the weight sync. So the changes are not trivial. I will keep the ticket open in case you have any suggestion.
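If I read the local training path correctly, the memory grows roughly like this (a sketch of my understanding, with a made-up model and thread count, not the actual optimizer code):

```scala
import com.intel.analytics.bigdl.dllib.nn.{Linear, Sequential}

// Each training thread works on its own clone of the model, and every clone
// carries its own output/gradient buffers, even for frozen layers.
val model = Sequential[Float]()
  .add(Linear[Float](1024, 1024).setName("frozen_fc").freeze())
  .add(Linear[Float](1024, 10).setName("head"))

val threadNumber = 4 // hypothetical number of training threads
val replicas = (1 to threadNumber).map(_ => model.cloneModule())
// The buffers of "frozen_fc" are duplicated threadNumber times although its
// weights will never change, which is where the extra memory comes from.
```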

emartinezs44 avatar Sep 06 '23 08:09 emartinezs44