Video-LLaMA
Training model
I ran inference on a single RTX 4090 GPU (24 GB) and it worked.
Now I am trying to train this model while keeping GPU memory usage as low as possible. I found that the for-loop in modeling_llama.py (line 542) increases GPU memory usage on every iteration:
```python
for idx, decoder_layer in enumerate(self.layers):
    if output_hidden_states:
        all_hidden_states += (hidden_states,)
    ...
```
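Two things can make memory grow across that layer loop: `output_hidden_states=True` keeps a reference to every layer's output alive in the `all_hidden_states` tuple, and autograd caches each layer's activations for the backward pass even when the layer's own weights are frozen. A minimal sketch of the difference, using a toy stack of linear layers as a stand-in for the frozen LLaMA decoder (the names here are illustrative, not from the repo):

```python
import torch
import torch.nn as nn

# Toy stand-in for the frozen decoder stack.
layers = nn.ModuleList([nn.Linear(64, 64) for _ in range(4)])
for p in layers.parameters():
    p.requires_grad = False  # frozen, as in the repo

x = torch.randn(2, 64)

# Wasteful pattern: collect every intermediate hidden state.
all_hidden_states = ()
h = x
for layer in layers:
    all_hidden_states += (h,)  # keeps len(layers) tensors alive at once
    h = layer(h)

# Leaner pattern: frozen layers need no autograd graph, so run them
# under no_grad and keep only the current hidden state; the previous
# one can be freed after each step.
with torch.no_grad():
    h2 = x
    for layer in layers:
        h2 = layer(h2)

assert torch.allclose(h, h2)  # same result, far less retained memory
```

Since you only train the projection layer in stage two, wrapping the frozen LLaMA forward in `torch.no_grad()` (or enabling gradient checkpointing) should noticeably flatten the per-layer memory growth.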
I read your paper, and it says the LLaMA model should be frozen during training. I noticed you already set `param.requires_grad = False` on the parameters of self.llama_model. But this differs slightly from how the Q-Former is frozen: self.llama_model is never put into eval() mode.
```python
if freeze_qformer:
    for name, param in self.Qformer.named_parameters():
        param.requires_grad = False
    self.Qformer = self.Qformer.eval()
    self.Qformer.train = disabled_train
    self.query_tokens.requires_grad = False
    logging.info("freeze Qformer")
logging.info('Loading Q-Former Done')
```
```python
self.llama_model = LlamaForCausalLM.from_pretrained(
    llama_model,
    torch_dtype=torch.bfloat16,
)
for name, param in self.llama_model.named_parameters():
    param.requires_grad = False
logging.info('Loading LLAMA Done')
```
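For consistency, the same freezing recipe used for the Q-Former could be applied to llama_model. A hedged sketch (a toy `nn.Sequential` stands in for the real model, and `disabled_train` mirrors the repo's helper, which overrides `.train()` so later `model.train()` calls cannot re-enable dropout in the frozen sub-module; the `__get__` binding here is my own adaptation, not copied from the repo):

```python
import torch.nn as nn

def disabled_train(self, mode=True):
    """Keep the module in eval mode regardless of train()/eval() calls."""
    return self

# Toy stand-in for the frozen LLaMA model (includes dropout to show why
# eval mode matters: eval disables dropout, requires_grad alone does not).
llama_model = nn.Sequential(nn.Linear(8, 8), nn.Dropout(p=0.5))

for param in llama_model.parameters():
    param.requires_grad = False
llama_model = llama_model.eval()
llama_model.train = disabled_train.__get__(llama_model)  # bind as method

llama_model.train()              # now a no-op
assert not llama_model.training  # dropout stays inactive
```

Note that `requires_grad = False` only stops gradient updates; without `eval()`, dropout (and any batch-norm statistics) would still behave in training mode, so the frozen model's outputs would be stochastic.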
If I want to run the second-stage training on my GPU, it would be great to get some advice from you.
Can you fine-tune it using a 4090? Thanks
Yes