
Convert llama to Long llama

Open jakebonk opened this issue 2 years ago • 1 comments

Do you have any thoughts on converting Llama/Alpaca/Vicuna models with LSG? Or would this be more difficult? It looks like they use rotary positional embeddings (RoPE) instead of absolute positional embeddings.

jakebonk avatar Apr 16 '23 17:04 jakebonk

Hi @jakebonk

The HF team added the Llama model a few days ago. From what I see in that implementation, it is likely possible to add LSG attention to a Llama model. RoPE isn't a problem, since it can be applied to the queries and keys before computing the score matrix.
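To illustrate the RoPE point, here is a minimal, dependency-free sketch. It is not the LSG code and not the HF Llama implementation. The helper names (`rope`, `dot`, `attention_scores`) and the `window` parameter are hypothetical, and the rotation uses the interleaved pair layout, whereas the HF Llama code pairs dimensions differently. The idea is that each query and key is rotated by its own position first, so any later choice of which keys a query attends to (a local window here) still yields the same scores as the full matrix.

```python
import math


def rope(vec, position, base=10000.0):
    """Rotate each pair (vec[i], vec[i+1]) by the angle position * base**(-i/d).

    Interleaved pair layout, chosen for readability (an assumption of this sketch).
    """
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = position * base ** (-i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out.extend([x * c - y * s, x * s + y * c])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attention_scores(queries, keys, window=None):
    """Return {(i, j): score} for the query/key pairs that are kept.

    RoPE is applied per token *before* any scores are computed, so the
    sparsity pattern (here: keys within `window` positions of the query,
    or all keys when window is None) does not affect the values.
    """
    q_rot = [rope(q, i) for i, q in enumerate(queries)]
    k_rot = [rope(k, j) for j, k in enumerate(keys)]
    scale = 1.0 / math.sqrt(len(queries[0]))
    scores = {}
    for i in range(len(queries)):
        lo = 0 if window is None else max(0, i - window)
        hi = len(keys) if window is None else min(len(keys), i + window + 1)
        for j in range(lo, hi):
            scores[(i, j)] = dot(q_rot[i], k_rot[j]) * scale
    return scores


# Toy data: 4 tokens, head dimension 4 (values are arbitrary).
tokens = [
    [0.1, 0.2, 0.3, 0.4],
    [0.5, -0.1, 0.2, 0.0],
    [-0.3, 0.7, 0.1, 0.2],
    [0.4, 0.4, -0.2, 0.6],
]
full = attention_scores(tokens, tokens)
local = attention_scores(tokens, tokens, window=1)
```

Every entry in `local` matches the corresponding entry in `full`, and the score for a given query/key pair depends only on the difference between their positions, which is why a block-wise or sparse attention pattern can be layered on top of rotated queries and keys.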

I need to investigate this; I'll let you know within the week whether I can create a conversion script.

ccdv-ai avatar Apr 16 '23 19:04 ccdv-ai