The effect of LoRA finetuning with different target_modules
Hello, I am using the default target_modules (q_proj, v_proj) and the results look good.
Would the results improve if I made more target_modules trainable, such as the ones below (see the sketch after the list)?
"q_proj",
"v_proj",
"down_proj",
"gate_proj",
"up_proj",
I heard that people also get good results fine-tuning the "fc1" and "fc2" modules, according to this paper:
"we conclude that modifying head attention shows the best results when the parameter budget is very small, while the FFN can better utilize modifications at larger capacities."
Will adding fc1 and fc2 as target modules add additional trainable parameters?
Yes, I believe it does add additional trainable parameters: every module listed in target_modules gets its own pair of low-rank LoRA matrices, so a longer list means more trainable weights.
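A quick way to check this yourself is to compare trainable-parameter counts for the two configurations. This is only a rough sketch; the fc1/fc2 names apply to models whose MLP layers actually use those names (e.g. OPT-style architectures), and facebook/opt-350m is just an example of such a model:

```python
# Hypothetical comparison of trainable-parameter counts for two target_modules choices.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

def count_trainable(model_name, target_modules):
    base = AutoModelForCausalLM.from_pretrained(model_name)
    peft_model = get_peft_model(
        base,
        LoraConfig(r=8, target_modules=target_modules, task_type="CAUSAL_LM"),
    )
    return sum(p.numel() for p in peft_model.parameters() if p.requires_grad)

name = "facebook/opt-350m"  # example model whose MLP layers are named fc1/fc2
attn_only = count_trainable(name, ["q_proj", "v_proj"])
with_ffn = count_trainable(name, ["q_proj", "v_proj", "fc1", "fc2"])
print(f"attention-only LoRA params:  {attn_only:,}")
print(f"attention + FFN LoRA params: {with_ffn:,}")  # expect a larger count
```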