Carlos Mocholí
Meta AI has since released LLaMA 2. Additionally, new Apache 2.0-licensed weights are being released as part of the [Open LLaMA project](https://github.com/openlm-research/open_llama).

### To run LLaMA 2 weights, Open...
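For reference, a minimal sketch of loading the Open LLaMA weights directly through Hugging Face `transformers` (the `openlm-research/open_llama_7b` model id comes from the Open LLaMA README; this is not the lit-llama checkpoint-conversion path):

```python
from transformers import LlamaForCausalLM, LlamaTokenizer

# Model id from the Open LLaMA project; 3B and 13B variants are also published.
model_id = "openlm-research/open_llama_7b"

tokenizer = LlamaTokenizer.from_pretrained(model_id)
model = LlamaForCausalLM.from_pretrained(model_id)

prompt = "Q: What is the largest animal?\nA:"
inputs = tokenizer(prompt, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```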
Proposed in https://github.com/Lightning-AI/lit-llama/pull/255. The only difference in logic is the instruction tuning; we could add a flag for it, as in https://github.com/Lightning-AI/lit-llama/pull/278.
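As a rough sketch of what such a flag could control (function and field names here are hypothetical, not the ones from the linked PRs), the only branch is whether the prompt template gets applied:

```python
# Minimal sketch of an instruction-tuning flag in a data-preparation step.
def format_example(example: dict, instruction_tuning: bool = True) -> str:
    if not instruction_tuning:
        # Plain continuation-style finetuning: train directly on the raw text.
        return example["text"]
    # Instruction tuning: wrap the sample in an Alpaca-style prompt template.
    return (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        f"### Instruction:\n{example['instruction']}\n\n### Response:\n{example['output']}"
    )
```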
Our current LoRA implementation applies it only to the query and value (qv) projections. However, recent trends suggest there are performance improvements to be gained from applying it elsewhere. For instance, the [QLoRA](https://arxiv.org/pdf/2305.14314.pdf) paper...
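For illustration, a minimal, self-contained sketch of a LoRA linear layer that could be dropped onto projections beyond q/v; the names and hyperparameters are illustrative, not lit-llama's actual API:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Sketch of a LoRA-augmented linear layer (illustrative, not lit-llama's implementation)."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)
        self.linear.weight.requires_grad = False  # the base weight stays frozen
        self.lora_A = nn.Parameter(torch.empty(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        nn.init.normal_(self.lora_A, std=0.02)
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen base projection plus the low-rank trainable update.
        return self.linear(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling

# Beyond the attention q/v projections, QLoRA-style setups also adapt the
# attention output projection and the MLP layers, e.g.:
#   attn.proj = LoRALinear(n_embd, n_embd)
#   mlp.fc    = LoRALinear(n_embd, 4 * n_embd)
#   mlp.proj  = LoRALinear(4 * n_embd, n_embd)
```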
Port https://github.com/Lightning-AI/lit-parrot/pull/94 and https://github.com/Lightning-AI/lit-parrot/pull/98 to this repository
PEFT finetuning (LoRA, adapter) raises the following warning for each FSDP-wrapped layer (a transformer block in our case):

```python
The following parameters have requires_grad=True: ['transformer.h.0.attn.attn.lora_A', 'transformer.h.0.attn.attn.lora_B']
The following parameters have requires_grad=False: ...
```
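One possible way to avoid the warning, assuming plain PyTorch FSDP is doing the wrapping, is `use_orig_params=True`, which lets a flat parameter mix frozen and trainable tensors; a minimal sketch:

```python
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Hypothetical helper: `block` stands for one transformer block whose LoRA
# parameters are trainable while the base weights are frozen.
def wrap_block(block: nn.Module) -> FSDP:
    # use_orig_params=True keeps per-parameter requires_grad flags, so a flat
    # parameter may contain both frozen base weights and trainable LoRA weights
    # without triggering the warning above.
    return FSDP(block, use_orig_params=True)
```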
Requires https://github.com/Lightning-AI/lightning-thunder/pull/359
If `--train.max_steps` is equal to `--train.lr_warmup_steps`, then `T_max` is 0 and the scheduler hits a division by zero:

https://github.com/Lightning-AI/litgpt/blob/6fd737d3da240a67f4acb7a3ce733fa2e67538a4/litgpt/finetune/lora.py#L385

```python
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/carlos/nightly-env/bin/litgpt", line 8, in...
```
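A minimal sketch of the kind of guard that would avoid this, assuming the scheduler is `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max = max_steps - warmup_steps` (the helper and variable names here are illustrative):

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

def build_scheduler(optimizer: torch.optim.Optimizer, max_steps: int, warmup_steps: int) -> CosineAnnealingLR:
    # Clamp T_max to at least 1 so max_steps == lr_warmup_steps does not
    # produce T_max=0 and a ZeroDivisionError inside the scheduler.
    t_max = max(1, max_steps - warmup_steps)
    return CosineAnnealingLR(optimizer, T_max=t_max)
```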
### Is there an existing issue that is already proposing this?

- [X] I have searched the existing issues

### Is your feature request related to a problem? Please describe...
## 🐛 Bug

In my code, I am enabling a `tqdm` bar per worker with:

```python
global_rank = int(os.environ["DATA_OPTIMIZER_GLOBAL_RANK"])
num_workers = int(os.environ["DATA_OPTIMIZER_NUM_WORKERS"])
local_rank = global_rank % num_workers
for example in...
```
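For context, a self-contained sketch of that per-worker progress-bar pattern; the iterable and bar description are placeholders, and only the two environment variables come from the snippet above:

```python
import os
from tqdm import tqdm

global_rank = int(os.environ["DATA_OPTIMIZER_GLOBAL_RANK"])
num_workers = int(os.environ["DATA_OPTIMIZER_NUM_WORKERS"])
local_rank = global_rank % num_workers

# One bar per worker, stacked via `position` so they don't overwrite each other.
items = range(1000)  # placeholder for the worker's shard of examples
for example in tqdm(items, position=local_rank, desc=f"worker {local_rank}"):
    pass  # process the example
```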
## 🐛 Bug

### To Reproduce

Code:

```python
import os
import torch
import torch.distributed as tdist
import thunder
from thunder.tests.litgpt_model import GPT, Config

if __name__ == "__main__":
    tdist.init_process_group(backend="nccl")
    LOCAL_RANK =...
```
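A guess at the rest of the repro, for context only: it assumes `LOCAL_RANK` is read from the environment variable set by `torchrun`, a small hypothetical config, and the `thunder.distributed.fsdp` plus `thunder.jit` combination; the actual script may differ.

```python
import os
import torch
import torch.distributed as tdist
import thunder
from thunder.tests.litgpt_model import GPT, Config

if __name__ == "__main__":
    tdist.init_process_group(backend="nccl")
    LOCAL_RANK = int(os.environ["LOCAL_RANK"])  # set by torchrun
    device = torch.device("cuda", LOCAL_RANK)
    torch.cuda.set_device(device)

    # A tiny, hypothetical config keeps the repro fast.
    config = Config(block_size=128, n_layer=2, n_head=4, n_embd=128, vocab_size=512, padded_vocab_size=512)
    model = GPT(config).to(device)

    model = thunder.distributed.fsdp(model)  # shard parameters across ranks
    model = thunder.jit(model)               # compile with Thunder

    x = torch.randint(0, config.padded_vocab_size, (1, 64), device=device)
    logits = model(x)
    print(logits.shape)
    tdist.destroy_process_group()
```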