bug-fixed
The problem exists in `Zero3-offload`. It seems to lie in the parameter-partitioning part of Zero3: if the model has multiple parallel modules or frozen parameters, the offload...
@tjruwase please try this example (https://github.com/haotian-liu/LLaVA/blob/main/scripts/v1_5/finetune.sh) with zero3-offload. Thanks.
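For anyone trying to reproduce this, here is a minimal sketch of the kind of ZeRO-3 offload configuration being discussed. It is not the config file used by the LLaVA script. The helper name `make_zero3_offload_config`, the batch size, and the `/local_nvme` path are placeholders I made up for illustration; the `zero_optimization` keys are standard DeepSpeed config keys.

```python
import json


def make_zero3_offload_config(device: str = "cpu") -> dict:
    """Build a minimal DeepSpeed-style ZeRO-3 config with parameter and optimizer offload.

    Hypothetical helper for illustration only; values are placeholders.
    """
    if device not in ("cpu", "nvme"):
        raise ValueError(f"unsupported offload device: {device!r}")

    offload = {"device": device, "pin_memory": True}
    if device == "nvme":
        # Placeholder path (assumption): point this at a real local NVMe mount.
        offload["nvme_path"] = "/local_nvme"

    return {
        "train_micro_batch_size_per_gpu": 1,  # placeholder value
        "zero_optimization": {
            "stage": 3,  # ZeRO-3 partitions parameters, gradients and optimizer state
            "offload_optimizer": dict(offload),
            "offload_param": dict(offload),
        },
    }


config = make_zero3_offload_config("cpu")
print(json.dumps(config, indent=2))
```

The resulting JSON can be saved to a file and passed to the training script through its DeepSpeed config argument. Switching `device` to `"nvme"` gives the NVMe variant, which also needs a valid `nvme_path`.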
@jomayeri, thanks for the response. The file needed by the script can be downloaded here: https://huggingface.co/liuhaotian/llava-v1.5-mlp2x-336px-pretrain-vicuna-13b-v1.5/tree/main. Unfortunately, I think it's difficult for me to prepare a more concise...
@tjruwase I have updated my comment, please kindly check it. Thanks.
> @bug-fixed Does the same thing happen when you offload to CPU?

@jomayeri The machine I'm working on has very limited memory and is shared with others. It is difficult...