LongLM

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

20 LongLM issues, sorted by most recently updated

Hello. I just simply ran example.py and hit an error in the "=====**SelfExtend using Torch**======" part: ``` Traceback (most recent call last): File "./LongLM/example.py", line 112, in SelfExtend.apply(model, group_size,...

Hello, congratulations on your acceptance to ICML! I think it is well deserved! I have a question regarding the passkey retrieval task you posted. Could you briefly explain how you...

I tried to run example.py on an A100 (80 GB) GPU. There seems to be a bug at line 41 (https://github.com/datamllab/LongLM/blob/ee92c841eaf8c6e0989f49c2d63231ba06136345/example.py#L41): the current implementation doesn't load the input_ids tensors onto the...
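If the bug is what this snippet suggests, the usual fix is to move the tokenized inputs onto the model's device before calling generate(). A minimal sketch, assuming a typical Hugging Face tokenizer/model flow; the helper and variable names below are illustrative, not the actual example.py code:

```python
import torch

def to_model_device(batch: dict, device: torch.device) -> dict:
    """Move every tensor in a tokenizer batch (input_ids, attention_mask, ...)
    onto the given device, so generate() does not mix CPU and GPU tensors."""
    return {k: v.to(device) for k, v in batch.items()}

# Hypothetical usage, mirroring a common Hugging Face pattern:
#   inputs = tokenizer(prompt, return_tensors="pt")
#   inputs = to_model_device(inputs, model.device)
#   output = model.generate(**inputs)
```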

Dear author, I'm trying to run LongLM on a single A10 with 24 GB of memory. I tried 'meta-llama/Llama-2-7b-chat-hf' and it failed with an out-of-CUDA-memory error (attached). ![20240726105026](https://github.com/user-attachments/assets/2f788fc4-a005-49be-918c-4daee54967af) I realized that...
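For context on this kind of OOM report, a back-of-the-envelope check shows why a 7B model in full precision overflows a 24 GB card, and why half precision (or quantization) is the usual workaround. The arithmetic below is generic, and the loading call in the comments is a common Hugging Face pattern, not taken from this repo:

```python
def weight_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# Llama-2-7b weights alone:
fp32_gib = weight_gib(7e9, 4)  # ~26 GiB: already over a 24 GiB A10
fp16_gib = weight_gib(7e9, 2)  # ~13 GiB: fits, leaving room for the KV cache

# A commonly used half-precision load (an assumption; check the repo's own
# example.py for the exact invocation):
#   model = AutoModelForCausalLM.from_pretrained(
#       "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.float16)
```

Note that the KV cache grows with sequence length, so even an fp16 model that loads fine can still run out of memory at long inference lengths.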

I'm applying this to the Ghost 8B Beta (128k) chat version online [here](https://huggingface.co/spaces/lamhieu/ghost-8b-beta-128k) and it seems to work. In general, I have not yet fine-tuned and tested the parameters against...

When trying to run predictions from a self-extended model, I'm getting the error TypeError: 'NoneType' object is not subscriptable, but without applying selfextend() I don't get any error, and...

Is it possible to adapt this to Cohere Command R models?

Dear authors, Regarding the experimental results in Section 4.2, I noticed that the authors compared the performance of models using SWA and models using the SelfExtend method on the passkey...

Dear author, hello. I have a question about memory usage. I ran the original torch implementation on a V100×8 setup, but when the inference length reaches 8k, memory is insufficient....

I need to extend the context length of llama3.1-8b from, say, 8k up to 128k, and the same for gemma2. I see that there is...