
LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
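For context, the model is typically loaded through Hugging Face transformers with `trust_remote_code` so that the FoT-specific modeling code shipped with the checkpoint is used. A minimal sketch, assuming the `syzymon/long_llama_3b` checkpoint name (substitute whichever checkpoint you actually use):

```python
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

# Checkpoint name is an assumption; replace it with the checkpoint you use.
MODEL = "syzymon/long_llama_3b"

tokenizer = LlamaTokenizer.from_pretrained(MODEL)
# trust_remote_code pulls in the FoT-specific modeling code bundled with the checkpoint.
model = AutoModelForCausalLM.from_pretrained(
    MODEL, torch_dtype=torch.float32, trust_remote_code=True
)

prompt = "My name is Julien and I like to"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(input_ids=inputs.input_ids, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```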

18 long_llama issues (sorted by recently updated)

I am interested in loading LongLLaMA with the Mojo framework, as mentioned here: https://github.com/tairov/llama2.mojo, to increase model speed while applying 4-bit quantization for model compression. Could you provide guidance...
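This is not Mojo-specific, but for the 4-bit part: a sketch of 4-bit loading via bitsandbytes in transformers, again assuming the `syzymon/long_llama_3b` checkpoint name. Whether the FoT remote-code path is compatible with bitsandbytes quantization is an open assumption here, not something confirmed by the repository.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# NF4 4-bit weights with bf16 compute; requires the bitsandbytes package.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b",          # assumed checkpoint name
    quantization_config=bnb_config,
    trust_remote_code=True,
    device_map="auto",
)
```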

Could you share a way to contact you? I copied the code and moved both the model and the input to the GPU, and my results are nonsensical gibberish...

How much VRAM is needed to fine-tune the 3B model? Is 12 GB enough?
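As a rough sanity check on the 12 GB question, here is a back-of-envelope estimate under common mixed-precision Adam assumptions; it is not a measurement of this repository's training script:

```python
# Rough VRAM estimate for full fine-tuning of a 3B-parameter model with Adam
# in mixed precision (rule of thumb, ignores activations and framework overhead).
params = 3e9
bytes_weights   = params * 2   # bf16/fp16 weights
bytes_grads     = params * 2   # bf16/fp16 gradients
bytes_optimizer = params * 8   # Adam moments kept in fp32 (2 x 4 bytes per param)

total_gb = (bytes_weights + bytes_grads + bytes_optimizer) / 1e9
print(f"~{total_gb:.0f} GB before activations")   # ~36 GB
```

By this estimate, plain full fine-tuning does not fit in 12 GB; parameter-efficient approaches such as LoRA/QLoRA or 8-bit optimizers are the usual way to squeeze under that budget.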

Hi, I saw the paper mention that C_curr and C_prev come from the same document in the batch, but I didn't really see how this is implemented. It seems that in...

I don’t know much about how cross-batch data is loaded during training.
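Regarding the two questions above about cross-batch training data: the sketch below is not the authors' data pipeline, only a minimal illustration of one way consecutive chunks of the same document could be paired so that C_prev supplies the memory context for C_curr within a batch, while chunks from other documents act as negatives.

```python
from typing import Iterator, List, Tuple

def paired_chunks(doc_token_ids: List[int],
                  chunk_len: int) -> Iterator[Tuple[List[int], List[int]]]:
    # Yield (C_prev, C_curr) pairs of consecutive chunks from one document,
    # so each current chunk in a batch is accompanied by the previous chunk
    # of the *same* document; other documents' chunks serve as negatives.
    chunks = [doc_token_ids[i:i + chunk_len]
              for i in range(0, len(doc_token_ids) - chunk_len + 1, chunk_len)]
    for prev, curr in zip(chunks, chunks[1:]):
        yield prev, curr

# Hypothetical usage: `doc` stands in for the token ids of one document.
doc = list(range(10))
print(list(paired_chunks(doc, chunk_len=2)))
# [([0, 1], [2, 3]), ([2, 3], [4, 5]), ([4, 5], [6, 7]), ([6, 7], [8, 9])]
```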

Thanks for your awesome work! There is a small problem: when I fine-tune long_llama with gradient_checkpointing, it raises an error: ![image](https://github.com/CStanKonrad/long_llama/assets/55051961/ec56d425-d0bc-45f6-be34-b62501562795) Could you please update the code in transformers to...
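A commonly tried workaround for gradient-checkpointing errors in transformers is sketched below; it is an assumption based on general transformers usage, not a confirmed fix for the traceback in this issue.

```python
from transformers import AutoModelForCausalLM

# Checkpoint name is an assumption; use the one from your fine-tuning setup.
model = AutoModelForCausalLM.from_pretrained(
    "syzymon/long_llama_3b", trust_remote_code=True
)

# The KV cache is known to conflict with activation checkpointing, so disable it.
model.config.use_cache = False
model.gradient_checkpointing_enable()

# On recent transformers versions the non-reentrant variant is often more robust:
# model.gradient_checkpointing_enable(gradient_checkpointing_kwargs={"use_reentrant": False})
```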

Hi, can you provide the code or more detail on how you zero-shot evaluate the arXiv dataset? I cannot get good results when trying arXiv summarization. I guess it...
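The paper's exact prompt and evaluation setup are not reproduced here; the sketch below only shows one hypothetical zero-shot setup (a plain "TL;DR:" prompt scored with ROUGE via the `evaluate` library) as a starting point.

```python
import evaluate

def summarize(model, tokenizer, article: str, max_new_tokens: int = 200) -> str:
    # Hypothetical zero-shot prompt format; the paper's prompt may differ.
    prompt = f"{article}\n\nTL;DR:"
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    # Keep only the newly generated continuation, dropping the prompt tokens.
    gen = out[0][inputs.input_ids.shape[1]:]
    return tokenizer.decode(gen, skip_special_tokens=True)

rouge = evaluate.load("rouge")
# predictions / references would be lists of generated and gold summaries:
# scores = rouge.compute(predictions=predictions, references=references)
```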

I have a question about the rotary positional encoding part of the code. Your code:

```
def rotate_as_if_first(x, rotary_emb):
    # x: [bs, num_attention_heads, seq_len, head_size]
    # apply rotary as...
```
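For intuition on what "rotate as if first" can mean: in the standard LLaMA-style rotate_half formulation of RoPE, applying the rotation for position 0 is the identity, because cos(0) = 1 and sin(0) = 0. The following is a standalone sketch of standard RoPE, not the repository's implementation:

```python
import torch

def rotate_half(x):
    # Split the last dimension in half and rotate pairs: (x1, x2) -> (-x2, x1).
    x1, x2 = x.chunk(2, dim=-1)
    return torch.cat((-x2, x1), dim=-1)

def apply_rope(x, pos, base=10000.0):
    # Rotary position embedding in the LLaMA "rotate_half" convention.
    # x: [..., head_size]; pos: integer position of the token.
    head_size = x.shape[-1]
    inv_freq = 1.0 / (base ** (torch.arange(0, head_size, 2, dtype=torch.float32) / head_size))
    angles = pos * inv_freq                        # [head_size // 2]
    cos = torch.cat((angles, angles), dim=-1).cos()
    sin = torch.cat((angles, angles), dim=-1).sin()
    return x * cos + rotate_half(x) * sin

x = torch.randn(4, 64)
# Position 0: every rotation angle is 0, so the vector is unchanged.
assert torch.allclose(apply_rope(x, pos=0), x)
# A later position rotates the same vector to something different.
assert not torch.allclose(apply_rope(x, pos=5), x)
```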