
The effect of LoRA fine-tuning with different target_modules

Open lucasjinreal opened this issue 2 years ago • 3 comments

Hello, I'm using the default target_modules q_proj and v_proj, and the results look good.

Would the results be better if I made more target_modules trainable? For example (see the sketch below):

    "q_proj",
    "v_proj",
    "down_proj",
    "gate_proj",
    "up_proj",

lucasjinreal avatar Jun 01 '23 03:06 lucasjinreal

I heard that people also get good results fine-tuning the "fc1" and "fc2" modules, based on this paper:

"we conclude that modifying head attention shows the best results when the parameter budget is very small, while the FFN can better utilize modifications at larger capacities."

BabyChouSr avatar Jun 01 '23 19:06 BabyChouSr

Will targeting fc1 and fc2 add additional parameters?

lucasjinreal avatar Jun 02 '23 06:06 lucasjinreal

Yes, I believe it does: each extra target module gets its own pair of low-rank LoRA matrices, so the trainable parameter count grows with every module you add.
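For intuition, a back-of-the-envelope sketch: a LoRA adapter on a linear layer of shape (d_out, d_in) adds matrices A (r x d_in) and B (d_out x r), i.e. r * (d_in + d_out) extra trainable parameters. The dimensions below assume LLaMA-7B-like sizes (hidden 4096, FFN 11008, 32 layers) purely for illustration:

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    # A is (r x d_in), B is (d_out x r): r * (d_in + d_out) new parameters.
    return r * (d_in + d_out)

hidden, ffn, layers, r = 4096, 11008, 32, 8  # assumed LLaMA-7B-like sizes

attn_only = layers * 2 * lora_params(hidden, hidden, r)   # q_proj + v_proj
mlp_extra = layers * (
    2 * lora_params(hidden, ffn, r)   # gate_proj + up_proj
    + lora_params(ffn, hidden, r)     # down_proj
)
print(f"q/v only:            {attn_only:,}")   # 4,194,304
print(f"extra from MLP mods: {mlp_extra:,}")   # 11,599,872
```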

BabyChouSr avatar Jun 03 '23 03:06 BabyChouSr