LongLM

[ICML'24 Spotlight] LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning

20 LongLM issues, sorted by most recently updated

Hello. I just simply ran example.py and hit an error in the "=====**SelfExtend using Torch**======" part: ``` Traceback (most recent call last): File "./LongLM/example.py", line 112, in SelfExtend.apply(model, group_size,...

Hello, congratulations on your acceptance to ICML! I think it is well deserved! I have a question regarding the passkey retrieval task you posted. Could you briefly explain how you...

I tried to run example.py on an A100 (80 GB) GPU. There seems to be a bug at line 41 (https://github.com/datamllab/LongLM/blob/ee92c841eaf8c6e0989f49c2d63231ba06136345/example.py#L41): the current implementation doesn't load the input_ids tensors onto the...
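If the bug is what this snippet suggests, the usual fix is to move the tokenized inputs onto the model's device before calling generate(). A minimal sketch, assuming a typical Hugging Face tokenizer/model flow; the helper and variable names below are illustrative, not the actual example.py code:

```python
import torch

def to_model_device(batch: dict, device: torch.device) -> dict:
    """Move every tensor in a tokenizer batch (input_ids, attention_mask, ...)
    onto the given device, so generate() does not mix CPU and GPU tensors."""
    return {k: v.to(device) for k, v in batch.items()}

# Hypothetical usage, mirroring a common Hugging Face pattern:
#   inputs = tokenizer(prompt, return_tensors="pt")
#   inputs = to_model_device(inputs, model.device)
#   output = model.generate(**inputs)
```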

Dear author, I'm trying to run LongLM on a single A10 with 24 GB of memory. I tried 'meta-llama/Llama-2-7b-chat-hf' and it failed with an out-of-CUDA-memory error (attached). ![20240726105026](https://github.com/user-attachments/assets/2f788fc4-a005-49be-918c-4daee54967af) I realized that...
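For context on this kind of OOM report, a back-of-the-envelope check shows why a 7B model in full precision overflows a 24 GB card, and why half precision (or quantization) is the usual workaround. The arithmetic below is generic, and the loading call in the comments is a common Hugging Face pattern, not taken from this repo:

```python
def weight_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return n_params * bytes_per_param / 2**30

# Llama-2-7b weights alone:
fp32_gib = weight_gib(7e9, 4)  # ~26 GiB: already over a 24 GiB A10
fp16_gib = weight_gib(7e9, 2)  # ~13 GiB: fits, leaving room for the KV cache

# A commonly used half-precision load (an assumption; check the repo's own
# example.py for the exact invocation):
#   model = AutoModelForCausalLM.from_pretrained(
#       "meta-llama/Llama-2-7b-chat-hf", torch_dtype=torch.float16)
```

Note that the KV cache grows with sequence length, so even an fp16 model that loads fine can still run out of memory at long inference lengths.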

I'm applying this to the Ghost 8B Beta (128k) chat version online [here](https://huggingface.co/spaces/lamhieu/ghost-8b-beta-128k) and it seems to work. In general, I have not yet fine-tuned and tested the parameters against...

When trying to run predictions from a self-extended model, I'm getting the error TypeError: 'NoneType' object is not subscriptable, but without applying selfextend() I don't get any error, and...

Is it possible to adapt this to Cohere Command R models?

Dear authors, Regarding the experimental results in Section 4.2, I noticed that the authors compared the performance of models using SWA and models using the SelfExtend method on the passkey...

Dear author, hello. I have a question about memory usage. I ran the original torch implementation on a V100×8 setup, but when the inference length reaches 8k, memory is insufficient....

I need to extend the context length of llama3.1-8b from, say, 8k up to 128k, and the same for gemma2. I see that there is...