Carlos Mocholí

90 issue results for Carlos Mocholí

Meta AI has since released LLaMA 2. Additionally, new Apache 2.0 licensed weights are being released as part of the [Open LLaMA project](https://github.com/openlm-research/open_llama). To run LLaMA 2 weights, Open...

Proposed in https://github.com/Lightning-AI/lit-llama/pull/255. The only difference in logic is the instruction tuning. We could add a flag for it, as in https://github.com/Lightning-AI/lit-llama/pull/278

good first issue
inference

Our current LoRA implementation applies it only to the query and value (qv) computation. However, recent trends suggest performance gains from applying it elsewhere. For instance, the [QLoRA](https://arxiv.org/pdf/2305.14314.pdf) paper...
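One reading of "applying it elsewhere" is wrapping additional linear projections (key, output projection, MLP layers) with LoRA adapters instead of only the fused qv path. A minimal sketch of such an adapter layer follows; the class name, rank, and initialization are illustrative assumptions, not litgpt's actual API:

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Sketch of a LoRA-wrapped linear layer that could replace any projection.

    The pretrained weight is frozen; only the low-rank factors A and B train.
    Note: names and defaults here are hypothetical, not litgpt's implementation.
    """

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)
        self.linear.weight.requires_grad = False  # frozen pretrained weight
        # Low-rank update: delta_W = B @ A, with B initialized to zero so the
        # adapter starts as a no-op (standard LoRA initialization).
        self.lora_A = nn.Parameter(torch.empty(r, in_features))
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        nn.init.normal_(self.lora_A, std=0.02)
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x) + (x @ self.lora_A.T @ self.lora_B.T) * self.scaling
```

Each adapter adds only `r * (in_features + out_features)` trainable parameters, which is why extending LoRA to every projection is cheap relative to full finetuning.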

Port https://github.com/Lightning-AI/lit-parrot/pull/94 and https://github.com/Lightning-AI/lit-parrot/pull/98 to this repository

enhancement
fine-tuning

PEFT finetuning (LoRA, adapter) raises the following warning for each FSDP-wrapped layer (a transformer block in our case):

```python
The following parameters have requires_grad=True: ['transformer.h.0.attn.attn.lora_A', 'transformer.h.0.attn.attn.lora_B']
The following parameters have requires_grad=False:...
```

fine-tuning

Requires https://github.com/Lightning-AI/lightning-thunder/pull/359

If `--train.max_steps` is equal to `--train.lr_warmup_steps`, then `T_max` is 0, causing a division by zero: https://github.com/Lightning-AI/litgpt/blob/6fd737d3da240a67f4acb7a3ce733fa2e67538a4/litgpt/finetune/lora.py#L385

```python
[rank0]: Traceback (most recent call last):
[rank0]:   File "/home/carlos/nightly-env/bin/litgpt", line 8, in...
```
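The failure mode can be seen in a stripped-down cosine-with-warmup schedule (a sketch of the general formula, not litgpt's exact code): the cosine phase divides by `max_steps - warmup_steps`, which is zero when the two flags are equal.

```python
import math


def cosine_lr(step: int, max_lr: float, max_steps: int, warmup_steps: int) -> float:
    """Linear warmup followed by cosine decay (simplified sketch)."""
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    t_max = max_steps - warmup_steps  # zero when the two flags are equal
    progress = step - warmup_steps
    # cos(pi * progress / t_max) divides by t_max -> ZeroDivisionError if t_max == 0
    return 0.5 * max_lr * (1 + math.cos(math.pi * progress / t_max))
```

With `max_steps == warmup_steps` the first post-warmup call hits the zero denominator; guarding the subtraction (e.g. `max(1, max_steps - warmup_steps)`) or validating the two flags against each other would avoid the crash.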

bug
help wanted

### Is there an existing issue that is already proposing this?

- [X] I have searched the existing issues

### Is your feature request related to a problem? Please describe...

## 🐛 Bug

In my code, I am enabling a `tqdm` bar per worker with:

```python
global_rank = int(os.environ["DATA_OPTIMIZER_GLOBAL_RANK"])
num_workers = int(os.environ["DATA_OPTIMIZER_NUM_WORKERS"])
local_rank = global_rank % num_workers
for example in...
```

bug
help wanted
won't fix

## 🐛 Bug

### To Reproduce

Code:

```python
import os
import torch
import torch.distributed as tdist
import thunder
from thunder.tests.litgpt_model import GPT, Config

if __name__ == "__main__":
    tdist.init_process_group(backend="nccl")
    LOCAL_RANK =...
```

bug
distributed