amitagh

Results: 8 comments by amitagh

Hi @hiyouga, any input on this? I saw the above with Gemma too, though after PT the PT output was showing adapter files; after I execute the merge the...

I don't see merge_lora_to_base_model() being called to merge the model and the adapter. For pretraining, it looks like trainer.train() generates the final model and not the pretraining step. Can you please confirm? Below...
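For reference, a minimal sketch of what I mean by merging, using PEFT's merge_and_unload(); the paths are placeholders and this is not the project's merge_lora_to_base_model():

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Load the base model in half precision (placeholder paths).
base = AutoModelForCausalLM.from_pretrained(
    "path/to/base-model", torch_dtype=torch.float16
)
# Attach the trained LoRA adapter.
model = PeftModel.from_pretrained(base, "path/to/adapter")
# Fold the LoRA deltas into the base weights and drop the adapter wrappers.
merged = model.merge_and_unload()
merged.save_pretrained("path/to/merged-model")

# Save the tokenizer alongside the merged weights.
tokenizer = AutoTokenizer.from_pretrained("path/to/base-model")
tokenizer.save_pretrained("path/to/merged-model")
```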

This happens due to the following config: lora_modules_to_save: embed_tokens, lm_head. Without this it works fine. There was a PEFT library issue for this, but it still seems to be there: https://github.com/huggingface/trl/issues/1287
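For context, that YAML key maps onto PEFT's modules_to_save option, which trains the named modules in full and saves them alongside the LoRA adapter. A rough plain-PEFT equivalent (r, alpha, and targets are illustrative values, not the project's defaults):

```python
from peft import LoraConfig

# Rough PEFT equivalent of the YAML setting above.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],          # example LoRA targets
    modules_to_save=["embed_tokens", "lm_head"],  # trained in full and saved with the adapter
    task_type="CAUSAL_LM",
)
```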

The adapter size of 7 GB is taking too long to merge: over 15 minutes and still stuck in merging, so I had to kill it. Most likely it generated an FP32 version...
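A quick way to check that suspicion (a sketch, assuming the adapter was saved under the usual adapter_model.safetensors name) is to inspect the tensor dtypes directly:

```python
import torch
from safetensors.torch import load_file

# Inspect the adapter's tensor dtypes; float32 would roughly double its size
# compared to a float16/bfloat16 adapter.
state = load_file("path/to/adapter/adapter_model.safetensors")
print({tensor.dtype for tensor in state.values()})  # e.g. {torch.float32}
```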

Was there any final fix for this issue, or was downgrading transformers to a lower version the only option?

You are correct. After placing --debug after the yml file it works. (-100, 128000) ###(-100, 14711) System(-100, 744) : (-100, 512) You(-100, 2675) are(-100, 527) an(-100, 459) AI(-100, 15592) assistant(-100, 18328)...
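For anyone reading that dump: the -100 values are labels, and -100 is the default ignore_index of the cross-entropy loss, so those prompt tokens simply don't contribute to training. A toy illustration:

```python
import torch
import torch.nn.functional as F

# 5 positions over a 32k vocab; the first two labels are -100, so they are
# ignored by the loss exactly like the masked prompt tokens in the dump above.
logits = torch.randn(5, 32000)
labels = torch.tensor([-100, -100, 14711, 744, 512])
loss = F.cross_entropy(logits, labels, ignore_index=-100)
print(loss)
```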

This issue is still there. Earlier I thought it was a problem with my script, but it is not: axolotl is saving in FP32 at the end of pretraining...
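A possible post-hoc workaround (a sketch with placeholder paths, not an axolotl fix): reload the FP32 checkpoint in half precision and re-save it, which roughly halves the disk footprint:

```python
import torch
from transformers import AutoModelForCausalLM

# Reload the FP32 checkpoint, casting to float16, then re-save it.
model = AutoModelForCausalLM.from_pretrained(
    "path/to/pretrain-output", torch_dtype=torch.float16
)
model.save_pretrained("path/to/pretrain-output-fp16", safe_serialization=True)
```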