LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
Hi, in loralib's layer modules (https://github.com/microsoft/LoRA/blob/33b953630763c6299d2349abc8f154a3951a7984/loralib/layers.py#L138), it seems like the `eval()` function, which merges W + BA, is never called. This is because when changing the model to evaluation mode in torch by...
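For context on how the merge is usually triggered: in PyTorch, `model.eval()` dispatches to `train(False)`, so the merge can live in an overridden `train()` method rather than in a separate `eval()` hook. A minimal sketch of that pattern (simplified; not the exact loralib code):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Linear):
    """Simplified sketch of a LoRA linear layer that merges W + BA on eval."""
    def __init__(self, in_features, out_features, r=4, lora_alpha=1):
        super().__init__(in_features, out_features)
        self.lora_A = nn.Parameter(torch.zeros(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = lora_alpha / r
        self.merged = False

    def train(self, mode: bool = True):
        # model.eval() calls train(False), so the merge happens here.
        super().train(mode)
        if mode and self.merged:
            # back to training: un-merge so A and B can keep updating
            self.weight.data -= (self.lora_B @ self.lora_A) * self.scaling
            self.merged = False
        elif not mode and not self.merged:
            # eval: fold the low-rank update into the frozen weight
            self.weight.data += (self.lora_B @ self.lora_A) * self.scaling
            self.merged = True

    def forward(self, x):
        out = super().forward(x)
        if not self.merged:
            out = out + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
        return out
```

The merged and unmerged paths produce the same output; merging just removes the extra matmuls at inference time.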
I used the script roberta_base_sst2.sh to reproduce the result, but I can't find the LoRA matrix checkpoint. Ideally, there should be a 3.4MB bin file which contains the weights...
According to the paper, the LoRA parameter count for GPT-2 medium is 0.35M, but since the hidden dimension is 1024 and the model has 24 layers, with rank=4, the...
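For reference, a rough back-of-the-envelope count, assuming LoRA is applied only to the query and value projections (W_q, W_v) as in the paper's GPT-2 experiments; this gives roughly 0.39M, close to but not exactly the reported 0.35M, so the discrepancy presumably hinges on exactly which matrices are adapted:

```python
d_model = 1024    # hidden dimension of GPT-2 medium
n_layers = 24
r = 4             # LoRA rank

# Each adapted d x d weight gets A (r x d) and B (d x r): 2 * d * r params.
per_matrix = 2 * d_model * r            # 8192
# Applied to W_q and W_v in every layer:
total = per_matrix * 2 * n_layers
print(total)  # 393216, i.e. ~0.39M trainable parameters
```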
I didn't find anything about the "scale" input in the documentation. I would like to know how to change the "scale" input to adjust the LoRA method.
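For context, in loralib the low-rank update is multiplied by a scale factor of lora_alpha / r before being added to the frozen weight. A tiny illustration with hypothetical values:

```python
import torch

r, lora_alpha = 4, 16
d = 8
A = torch.randn(r, d)
B = torch.randn(d, r)

scaling = lora_alpha / r           # loralib's scale factor, here 4.0
delta_w = (B @ A) * scaling        # the update added to the frozen weight W
print(delta_w.shape)               # same shape as W: (8, 8)
```

So changing lora_alpha (or r) changes how strongly the adapted directions contribute relative to the pretrained weight.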
When I try to train the NLG model on multiple GPUs, I use this:
```
python -m torch.distributed.launch --nproc_per_node=2 --use_env src/gpt2_ft.py \
    --train_data ./data/e2e/train.jsonl \
    --valid_data ./data/e2e/valid.jsonl \
    --train_batch_size 8 \
    --grad_acc 1...
```
I am trying to use LoRA on a loaded checkpoint of a CodeT5 model. However, when I do, the runtime is about the same, and my result is not...
Hi, I am studying LoRA, and thanks for your work. I have a simple question which is really confusing me. Do the two hyper-parameters **lora-dim** of the GPT-2 model...
I am getting this result for these hyperparameters: https://github.com/microsoft/LoRA/tree/main/examples/NLG#replicating-our-result-on-e2e What were the hyperparameters for the results in the paper: https://github.com/microsoft/LoRA/tree/main/examples/NLG#adapting-gpt-2-using-lora
I am a beginner in deep learning, and I would like to know whether the reason for the gradient being 0 is vanishing gradients or if...
I have recently completed training a model using LoRA (referred to as LoRA-1) with Dataset A. I am now considering how best to proceed with training on a new Dataset...
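One common option, sketched below as plain tensor math (hypothetical shapes, not an official recommendation): merge the LoRA-1 update into the base weight, then attach fresh low-rank factors for the new dataset. The fresh pair is initialized so its update starts at zero (loralib-style: A random, B zero), so training on the new data begins from the LoRA-1-adapted model:

```python
import torch

d, r = 16, 4
W = torch.randn(d, d)             # frozen pretrained weight
A1 = torch.randn(r, d) * 0.01     # trained LoRA-1 factors (from Dataset A)
B1 = torch.randn(d, r) * 0.01

# Merge LoRA-1 into the base, then start LoRA-2 from scratch.
W_merged = W + B1 @ A1
A2 = torch.randn(r, d) * 0.01     # fresh factors for the new dataset;
B2 = torch.zeros(d, r)            # B2 = 0 makes the initial update zero

# Forward now uses W_merged + B2 @ A2; only A2 and B2 are trained.
effective = W_merged + B2 @ A2
```

The trade-off is that merging fixes LoRA-1's contribution permanently; keeping LoRA-1 as a separate adapter instead would let you switch it off later.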