LoRA
Code for loralib, an implementation of "LoRA: Low-Rank Adaptation of Large Language Models"
The current Conv1d (and Conv3d) does not work due to the incompatible shape of (lora_A @ lora_B). I changed only lora_B's initialization; the shape of lora_B now depends on...
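For context, here is a minimal sketch of why the shapes break and one possible shape fix. This is my own illustration, not the patch from the issue: it assumes the LoRA update is formed as `(lora_B @ lora_A).view(conv.weight.shape)`, in which case lora_B's first dimension must carry a factor of `k ** (num_dims - 1)` so the product has exactly as many elements as the conv weight.

```python
import torch
import torch.nn as nn

# Hypothetical shapes: an N-dim conv weight has out_ch * in_ch * k**num_dims
# elements. The Conv2d-only shapes (out_ch*k, r*k) x (r*k, in_ch*k) give
# out_ch * in_ch * k**2 elements, which only matches when num_dims == 2.
def lora_conv_shapes(in_ch, out_ch, k, num_dims, r):
    lora_A_shape = (r * k, in_ch * k)
    lora_B_shape = (out_ch * k ** (num_dims - 1), r * k)  # depends on num_dims
    return lora_A_shape, lora_B_shape

# Quick check that the product now reshapes cleanly for Conv1d/2d/3d:
for num_dims, conv_cls in [(1, nn.Conv1d), (2, nn.Conv2d), (3, nn.Conv3d)]:
    in_ch, out_ch, k, r = 4, 8, 3, 2
    conv = conv_cls(in_ch, out_ch, k)
    a_shape, b_shape = lora_conv_shapes(in_ch, out_ch, k, num_dims, r)
    lora_A, lora_B = torch.zeros(a_shape), torch.zeros(b_shape)
    delta = (lora_B @ lora_A).view(conv.weight.shape)
    assert delta.shape == conv.weight.shape
```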
While reading /examples/NLU/examples/text-classification/run_glue.py, I noticed that the GLUE script only uses the validation set for generating results and does not measure accuracy on the test set. Would it...
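One relevant fact here (an aside, not from the issue itself): GLUE test-set labels are withheld and only scored via the official leaderboard, which is the usual reason scripts report metrics on the validation split only. This is easy to verify with the `datasets` library:

```python
from datasets import load_dataset

# GLUE test splits ship without gold labels; in the Hugging Face release
# every test label is set to -1, so accuracy cannot be computed locally.
test = load_dataset("glue", "rte", split="test")
print(test[0]["label"])  # prints -1
```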
In Section 3 of the LoRA paper ("Adapter Layers Introduce Inference Latency"): "There are many variants of adapters. We focus on the original design by Houlsby et al. (2019), which has two..."
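For readers without the paper at hand, the Houlsby et al. design the quote refers to is a small bottleneck MLP inserted sequentially into each Transformer block. A rough sketch of my own (not code from this repo) shows why it adds latency: the adapter sits on the forward path and cannot be merged into existing weights, unlike a LoRA update.

```python
import torch
import torch.nn as nn

class HoulsbyAdapter(nn.Module):
    """Bottleneck adapter: down-project, nonlinearity, up-project, residual.

    Two of these are inserted per Transformer block (after attention and
    after the FFN), adding sequential compute at inference time.
    """
    def __init__(self, d_model: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(d_model, bottleneck)
        self.up = nn.Linear(bottleneck, d_model)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))
```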
How can one improve the memory/retention ability of LoRA fine-tuning?
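Answers vary, but the usual knobs are a larger rank r, a larger lora_alpha, and applying LoRA to more weight matrices. A hedged sketch with loralib follows; the specific values are illustrative, not a recommendation from this repo.

```python
import torch.nn as nn
import loralib as lora

# Illustrative only: a higher rank r and adapting more projections tend to
# increase how much the adapter can absorb, at a higher parameter cost.
layer = lora.Linear(768, 768, r=16, lora_alpha=32, lora_dropout=0.05)

model = nn.Sequential(layer)
lora.mark_only_lora_as_trainable(model)  # freeze all non-LoRA parameters
```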
```python
def reset_parameters(self):
    nn.Embedding.reset_parameters(self)
    if hasattr(self, 'lora_A'):
        # initialize A the same way as the default for nn.Linear and B to zero
        # lora_A should be normal and lora_B should...
```
I am not able to download the LoRA adapters for the NLU task this week; is there anywhere else I can find them?
"We use a random Gaussian initialization for A and zero for B,” in paper but: ` def reset_parameters(self): nn.Embedding.reset_parameters(self) if hasattr(self, 'lora_A'): # initialize A the same way as the...
Fixed typo in README
Has anyone reproduced the LoRA results for roberta-base? I found that my reproduction cannot reach the results the paper claims, e.g., the paper claims that the RTE...
Hi, I was able to reproduce the GLUE benchmark results but not the NLG task. For the NLG tasks, I downloaded the checkpoint for GPT2-M and followed steps 2, 3, and 4 in...