
Code and documents of LongLoRA and LongAlpaca (ICLR 2024 Oral)

Results: 52 LongLoRA issues

### I followed the steps in the readme but encountered the following errors during SFT: [WARNING] async_io requires the dev libaio .so object and headers but these were not found....

Hi, thanks for the great work. I have a question regarding the **training set used** for the different types of models (**fully fine-tuned, LoRA+, and the models for the extra experiments in the paper**). In the...

![image](https://github.com/dvlab-research/LongLoRA/assets/55049714/132725db-b63a-42ac-af95-f3c60caacde3) How did you create the answers for a pair of papers being compared, such as the following example? Were they generated with GPT, or written by a human?

First, I ran the commands as follows:
```
CUDA_VISIBLE_DEVICES=1 torchrun --nproc_per_node=1 --master_port=29501 supervised-fine-tune.py \
    --model_name_or_path /mnt/42_store/lhj/data/mllm/model_weights/Llama-2-7b-chat-hf \
    --bf16 True \
    --output_dir outputs \
    --model_max_length 16384 \
    --use_flash_attn True \
    --data_path...
```

Some of your uploaded Hugging Face models lack the `rope_scaling` parameter in the config. Without `rope_scaling`, the model generates `" " " " " "`. `"rope_scaling": {"factor": 2.0,...
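A minimal sketch of the workaround described in this issue: patching a config dict that is missing `rope_scaling` before saving or loading the model. The example config keys and the factor 2.0 follow the snippet quoted above; in general the factor is the extended context length divided by the base model's original context length (this is an illustration, not the repository's official fix).

```python
import json

# Example config dict standing in for a checkpoint's config.json
# (hypothetical values for illustration).
config = {"model_type": "llama", "max_position_embeddings": 8192}

# If the uploaded checkpoint lacks `rope_scaling`, add it manually.
# factor = target_context_length / original_context_length,
# e.g. 8192 / 4096 = 2.0 as in the snippet quoted in the issue.
if "rope_scaling" not in config:
    config["rope_scaling"] = {"type": "linear", "factor": 2.0}

print(json.dumps(config["rope_scaling"]))
```

To patch an actual checkpoint, the same keys would be written back into its `config.json` before calling `from_pretrained`.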

![image](https://github.com/dvlab-research/LongLoRA/assets/147307433/a149481e-9bc5-4389-9058-d5e0dae83aef) My CUDA version is 11.2, so I can't install Flash Attention on my machine. I tried setting `use_flash_attn` to False when executing fine-tune.py, but I met this error be...

Hello, when saving a checkpoint, a very large global_step file is automatically saved as well. What is this file used for? Can I skip saving it? It takes up too much storage.

### Overview - This PR originates from https://github.com/dvlab-research/LongLoRA/issues/123 - I also faced similar problems, but no one had ever committed a fix for it... - I added a callback...

Hi, I have a question regarding the results in Table 8 and Table 14 (5 Dec 2023 version). In Table 8, for the 7B model at context length 8192, the ppl for full...

Thanks for this great work! I have several questions regarding the datasets and the corresponding models. Q1: I believe you used RedPajama for FT and LongAlpaca-12k for SFT. You...