
LoRA for Embedding layer token expansion?

Open · chapter544 opened this issue on Apr 18 '23 · 2 comments

Hi, by now we may all have heard about the Llama model. However, it was trained on a limited set of languages. I want to "transfer" the Llama knowledge to a new language that it wasn't trained on, say Korean. My thought is to expand the Llama token vocab (~32K) by adding Korean tokens, resulting in a bigger token embedding map. Then I will freeze all Llama layers except the word embedding layer and the LM head layer, and train a CausalLM model on Korean text. I only have a 3090 GPU, so I am thinking about peft/LoRA.

My question: is there an example or guide for doing this, e.g., setting the LoRA target modules to an embedding layer?
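In code, the setup I have in mind looks roughly like this (just a sketch; the checkpoint path and the list of new tokens are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "path/to/llama-7b"  # placeholder checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Add new Korean tokens to the vocab (illustrative examples only).
new_tokens = ["안녕하세요", "감사합니다"]
num_added = tokenizer.add_tokens(new_tokens)

# Resize the token embedding matrix (and LM head) to the expanded vocab.
model.resize_token_embeddings(len(tokenizer))

# Freeze everything except the input embeddings and the LM head.
for param in model.parameters():
    param.requires_grad = False
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True
for param in model.get_output_embeddings().parameters():
    param.requires_grad = True
```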

Thanks,

chapter544 · Apr 18 '23

LoRA is not suitable for this situation.

EeyoreLee · Apr 18 '23

Hello @chapter544, that is a nice idea and, as you have described it, it might work. You can make the embedding layers additionally trainable and add LoRA layers to the respective attention subblocks of Llama. Please refer to this comment https://github.com/huggingface/peft/issues/334#issuecomment-1514166635 on making additional modules trainable through `LoraConfig`.
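A rough sketch of what that could look like, assuming the resized model from above; the module names (`q_proj`, `v_proj`, `embed_tokens`, `lm_head`) follow the Hugging Face Llama implementation and should be checked against your model via `model.named_modules()`:

```python
from peft import LoraConfig, TaskType, get_peft_model

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],          # LoRA on the attention subblocks
    modules_to_save=["embed_tokens", "lm_head"],  # train these layers fully
)

peft_model = get_peft_model(model, lora_config)
peft_model.print_trainable_parameters()
```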

pacman100 · Apr 19 '23

Thank you very much for the info.

chapter544 · Apr 20 '23

Hello @chapter544, see this: https://github.com/huggingface/peft/pull/337#issuecomment-1527412343

pacman100 · Apr 28 '23