long_llama

LongLLaMA is a large language model capable of handling long contexts. It is based on OpenLLaMA and fine-tuned with the Focused Transformer (FoT) method.
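For readers who want to try it, here is a minimal loading sketch. It assumes the checkpoint is published on Hugging Face (the `syzymon/long_llama_3b` name below is an assumption) and that the memory-augmented layers live in the repo's custom modeling code, which is why `trust_remote_code=True` is passed:

```python
import torch
from transformers import LlamaTokenizer, AutoModelForCausalLM

# Assumed checkpoint name; adjust to the model variant you actually use.
MODEL = "syzymon/long_llama_3b"

tokenizer = LlamaTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    torch_dtype=torch.float32,
    trust_remote_code=True,  # loads the repo's custom memory-attention code
)

prompt = "My name is Julien and I like to"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```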

18 long_llama issues

How would you fine-tune in this style with an instruction fine-tuning dataset like Open-Orca?
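Not an official answer, but a rough sketch of how one might flatten Open-Orca records into plain-text training examples; the column names below are those of the public Open-Orca/OpenOrca dataset, and the FoT-specific training loop is not shown:

```python
from datasets import load_dataset

# Stream the public Open-Orca dataset and flatten each record into a single
# instruction-style text field suitable for causal-LM fine-tuning.
ds = load_dataset("Open-Orca/OpenOrca", split="train", streaming=True)

def to_text(example):
    return {
        "text": (
            f"{example['system_prompt']}\n\n"
            f"### Instruction:\n{example['question']}\n\n"
            f"### Response:\n{example['response']}"
        )
    }

train_stream = ds.map(to_text)
print(next(iter(train_stream))["text"][:200])
```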

In your paper, you say: "Position Interpolation (PI, [Chen et al., 2023] and [kaiokendev, 2023]) introduces a modification to the rotary positional encoding scheme that enables fine-tuning for 32K..."
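For context, Position Interpolation rescales position indices so that fine-tuning at a longer target context stays within the positional range seen during pre-training. A toy sketch (the train/target context lengths and dimensions below are illustrative, not the repo's code):

```python
import torch

def interpolated_rope_frequencies(seq_len, dim, train_ctx=2048, target_ctx=32768, base=10000.0):
    # Position Interpolation: instead of extrapolating to positions > train_ctx,
    # positions are rescaled by train_ctx / target_ctx so they remain inside
    # the range the rotary embeddings were trained on.
    scale = train_ctx / target_ctx
    positions = torch.arange(seq_len, dtype=torch.float32) * scale
    inv_freq = 1.0 / (base ** (torch.arange(0, dim, 2, dtype=torch.float32) / dim))
    angles = torch.outer(positions, inv_freq)      # (seq_len, dim/2)
    return torch.cos(angles), torch.sin(angles)    # fed into rotary attention
```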

That sounds massively interesting; while we try to run inference and read the paper, should we expect a release of the fine-tuning code?

If I use FAISS as the memory, then during inference each generated token requires 3 kNN searches (because there are 3 memory attention layers), right? Will generation become very slow?
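For illustration only, here is roughly what that per-token cost looks like with a flat FAISS index per memory layer; the key dimension, top-k, and pre-filled random keys are assumptions, not the repo's actual memory implementation:

```python
import numpy as np
import faiss

d, k, mem_layers = 128, 128, 3   # illustrative key dim, top-k, and layer count

# One flat inner-product index per memory attention layer, pre-filled with
# random "cached" keys standing in for previously processed context.
indexes = [faiss.IndexFlatIP(d) for _ in range(mem_layers)]
for index in indexes:
    index.add(np.random.rand(10_000, d).astype(np.float32))

def retrieve_for_token(per_layer_queries):
    """One generation step: each memory layer issues one kNN search."""
    results = []
    for index, q in zip(indexes, per_layer_queries):   # 3 searches per token
        scores, ids = index.search(q.reshape(1, d).astype(np.float32), k)
        results.append((scores, ids))
    return results

queries = [np.random.rand(d).astype(np.float32) for _ in range(mem_layers)]
_ = retrieve_for_token(queries)
```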

How much does the speed drop as the context length grows, compared with vanilla LLaMA?

https://arxiv.org/abs/2307.02486 This paper on scaling to a 1-billion-token context length, in addition to this work, seems like it would solve the pursuit of infinite context length. Also, FoT feels similar to L2P learn...

Hi! This is great work and I'm very interested in FoT. But I'm curious how it compares to RAG techniques. For example, would it be better to use...

It feels like this is not really about expanding the context window, but rather about enhancing it with the key-value pairs stored during training as external knowledge. This means that once the...

Hi, I am going through the page https://huggingface.co/syzymon/long_llama_code_7b_instruct. I found the text "All inputs were truncated and randomly padded (left/right) to 3072 tokens" under Training. Is there a reason behind this...
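A guess at what that preprocessing step looks like; the pad token id and the 50/50 left/right split are assumptions inferred from the quoted sentence, not the actual training code:

```python
import random

def truncate_and_random_pad(token_ids, target_len=3072, pad_id=0):
    """Truncate to target_len, then pad the remainder on a randomly
    chosen side (left or right), as the model card sentence describes."""
    token_ids = token_ids[:target_len]
    pad = [pad_id] * (target_len - len(token_ids))
    return pad + token_ids if random.random() < 0.5 else token_ids + pad

example = truncate_and_random_pad(list(range(10)), target_len=16)
```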

Hi, thank you for this great effort. I am trying to use your 3B m-instruct-v1_1 model to evaluate it on my custom long-context QA dataset with context lengths up to 200k...