Carlos Mocholí

Results 427 comments of Carlos Mocholí

Also, I suggest that you check out https://github.com/Lightning-AI/lit-gpt for all your GPT training needs. The one you linked is an earlier version that hasn't been updated.

@joseph35533553 We have adapter finetuning merged (https://github.com/Lightning-AI/lit-parrot/blob/main/howto/finetune_adapter.md), and LoRA is on its way: #128

Thanks for the report. Unfortunately, PyTorch doesn't support this, so we cannot measure the flops used by Mixtral the way we do. For the moment, you can avoid the error...

Hey! All your suggestions make sense to me. You should be able to split the combined ff linear as you suggest, especially if load_param has been called already. We also...
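For illustration, here's a minimal sketch of splitting a fused feed-forward linear into two independent `nn.Linear` modules, assuming the fused weight simply stacks the two projections along the output dimension (the layer shapes and names here are hypothetical, not the actual ones in the repository):

```python
import torch
from torch import nn

# Hypothetical fused layer: a single Linear producing both halves at once.
fused = nn.Linear(16, 2 * 64)

# Split the stacked weight and bias back into two independent Linears.
w1, w2 = fused.weight.chunk(2, dim=0)
b1, b2 = fused.bias.chunk(2, dim=0)

fc1 = nn.Linear(16, 64)
fc2 = nn.Linear(16, 64)
with torch.no_grad():
    fc1.weight.copy_(w1)
    fc1.bias.copy_(b1)
    fc2.weight.copy_(w2)
    fc2.bias.copy_(b2)

# The split layers reproduce the fused output exactly.
x = torch.randn(3, 16)
fused_out = fused(x)
split_out = torch.cat([fc1(x), fc2(x)], dim=-1)
assert torch.allclose(fused_out, split_out, atol=1e-6)
```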

I would strongly prefer that we don't add this new MLP class. To debug the output, you'll have to inspect the activations for both models layer by layer to see...
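As a sketch of that layer-by-layer comparison (the two tiny models below are stand-ins, not the actual ones in question), forward hooks can record every leaf module's output for both models so the first diverging layer is easy to spot:

```python
import torch
from torch import nn

def record_activations(model, x):
    """Run model(x) and return {module_name: output} for every leaf module."""
    acts, hooks = {}, []
    for name, module in model.named_modules():
        if len(list(module.children())) == 0:  # leaf modules only
            hooks.append(module.register_forward_hook(
                lambda m, inp, out, name=name: acts.__setitem__(name, out.detach())
            ))
    model(x)
    for h in hooks:
        h.remove()
    return acts

torch.manual_seed(0)
a = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))
b = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 4))
b.load_state_dict(a.state_dict())
with torch.no_grad():
    b[2].weight.add_(0.1)  # deliberately perturb the last layer

x = torch.randn(2, 8)
acts_a, acts_b = record_activations(a, x), record_activations(b, x)
for name in acts_a:
    match = torch.allclose(acts_a[name], acts_b[name])
    print(name, "match" if match else "MISMATCH")
```

Here only the final layer reports a mismatch, which pinpoints where the two models diverge.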

Hi! I don't think this should happen. Can you share the exact command that you ran, the complete error stacktrace, and any changes you made to the repository?

Sorry, it was an accident!

Quantization support landed with #104, so now you can do inference with fewer requirements.

For the moment, and as a workaround, I would strongly suggest that you simply `torch.compile` the underlying `nn.Module` instead of the `LightningModule`

Here's a repro:

```python
import os

import torch
from lightning.pytorch import LightningModule, Trainer
from torch.utils.data import DataLoader, Dataset
from lightning.pytorch.demos.boring_classes import RandomDataset

class BoringModel(LightningModule):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Linear(32, ...
```