Carlos Mocholí
## 🚀 Feature We have released https://github.com/Lightning-AI/utilities/ which contains multiple utilities that are shared across the codebase. ### Motivation Avoid protected imports Avoid duplicated code ### Pitch Go through the...
Replaces DeepSpeed with FSDP Requires https://github.com/Lightning-AI/lightning/pull/17845 Closes #116 Closes #177 Closes #169 Falcon 7b takes 32 GB max memory allocated using 2 devices and 32-true or bf16-mixed precision. Loss is...
- [ ] full #117 - [x] LoRA #128 - [x] Adapter #31
One of the CUDA tests is failing: `pytest tests/test_model.py::test_bfloat16_llama_init` ```python E RuntimeError: Expected query, key, and value to have the same dtype, but got query.dtype: float key.dtype: float and value.dtype:...
- Implements padding support directly in the model. - Adds a version of the model fully integrated with TransformerEngine - Adds a guide for training with TransformerEngine (WIP)
Part of https://github.com/Lightning-AI/lit-parrot/pull/123
Strict loading is useful as it enforces that your checkpoint gets loaded as you expect. In the case of the fine-tuned checkpoints, we can merge them to the pre-trained one...
Fixes https://github.com/Lightning-AI/lit-parrot/pull/127#issuecomment-1587732494
We support a large number of checkpoints, and there is a multitude of scripts that can be run. Users often ask questions like "can I run X script with Y model...
Follow-up to #147