Martyna Patelka
## 🐛 Bug

When running the benchmarking script with `--checkpoint_activations True` we get:

> AssertionError: t54580_out for rematerialisation

This issue is present for the following models: 'Llama-3-70B', 'Gemma-2-27b', 'longchat-13b-16k', 'Mistral-7B-v0.2', 'vicuna-7b-v1.5-16k', ...
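For context, here is a minimal sketch of the kind of setup that flag is presumed to enable: activation checkpointing around each transformer block, then jitting with Thunder. The toy `Block`/`Model` modules and sizes below are placeholders rather than the benchmark's real litgpt model, and whether `--checkpoint_activations True` maps onto `torch.utils.checkpoint` exactly like this is an assumption, not something the report confirms.

```python
import torch
import torch.nn as nn
import torch.utils.checkpoint as checkpoint
import thunder


class Block(nn.Module):
    # Placeholder stand-in for a transformer block.
    def __init__(self, dim: int = 4096) -> None:
        super().__init__()
        self.fc1 = nn.Linear(dim, dim)
        self.fc2 = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc2(torch.nn.functional.gelu(self.fc1(x)))


class Model(nn.Module):
    def __init__(self, n_blocks: int = 4) -> None:
        super().__init__()
        self.blocks = nn.ModuleList(Block() for _ in range(n_blocks))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            # Recompute each block's activations during backward instead of storing them.
            x = checkpoint.checkpoint(block, x, use_reentrant=False)
        return x


model = Model().cuda()
jitted = thunder.jit(model)  # the rematerialisation assertion is reported under this path
out = jitted(torch.randn(2, 4096, device="cuda", requires_grad=True))
out.sum().backward()
```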
## 🐛 Bug

This might be related to the [old OOM issue](https://github.com/Lightning-AI/lightning-thunder/issues/474), but the models and the number of nodes are different, so I decided to create a separate issue. We get an OOM error, ...
## 🐛 Bug

When using DDP with Dynamo+Thunder we get:

> AttributeError: 'Float8Tensor' object has no attribute '_fp8_attrs'

This issue affects the following models: 'dolly-v2-3b', 'Mistral-7B-v0.1', 'tiny-llama-1.1b', 'stablecode-completion-alpha-3b', 'Phi-3-mini-4k-instruct', 'falcon-7b' ...
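A rough sketch of the DDP + Dynamo + Thunder wiring being described, assuming `thunder.dynamo.ThunderCompiler` as the `torch.compile` backend; the toy `nn.Linear` stands in for the listed models, and the FP8 layers that actually produce Transformer Engine's `Float8Tensor` are not shown here.

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP
from thunder.dynamo import ThunderCompiler

dist.init_process_group(backend="nccl")
local_rank = int(os.environ["LOCAL_RANK"])
torch.cuda.set_device(local_rank)

# Placeholder model; the report covers the litgpt models listed above.
model = torch.nn.Linear(1024, 1024).cuda()
ddp_model = DDP(model, device_ids=[local_rank])

# torch.compile routes FX graphs to Thunder through the ThunderCompiler backend ("ThunderFX").
compiled = torch.compile(ddp_model, backend=ThunderCompiler())

x = torch.randn(8, 1024, device="cuda")
compiled(x).sum().backward()
```

Launched with `torchrun --nproc_per_node=<num_gpus>`; since the error mentions `Float8Tensor`, the real reproduction presumably also enables FP8 (Transformer Engine) on top of this setup.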
## 🐛 Bug

When training the models 'vicuna-7b-v1.5-16k', 'longchat-13b-16k', 'Mistral-7B-v0.2', 'falcon-180B', 'Llama-3-70B', 'CodeLlama-34b-hf' with FSDP and FP8 we get `KeyError: 'scaling_fwd'`. This might also be an issue with Transformer Engine, so I'm ...
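A rough sketch of the FSDP + FP8 combination in question, assuming Thunder's `thunder.distributed.fsdp` transform together with Transformer Engine's `te.Linear` and `fp8_autocast`; the single `te.Linear` is a placeholder for the listed models, and the exact wiring in the benchmark script may differ.

```python
import os
import torch
import torch.distributed as dist
import transformer_engine.pytorch as te
from transformer_engine.common.recipe import DelayedScaling
import thunder
import thunder.distributed

dist.init_process_group(backend="nccl")
torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

# Placeholder model; the benchmark trains full litgpt transformer models instead.
model = te.Linear(4096, 4096).cuda()
model = thunder.distributed.fsdp(model)  # assumed Thunder FSDP transform
model = thunder.jit(model)

x = torch.randn(8, 4096, device="cuda", requires_grad=True)
# Transformer Engine keeps per-module FP8 state (amax history, scales) under keys such as
# 'scaling_fwd', which is presumably where the reported KeyError originates.
with te.fp8_autocast(enabled=True, fp8_recipe=DelayedScaling()):
    out = model(x)
out.sum().backward()
```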
## 🚀 Feature

Make Thunder + Mistral-7B-v0.1 as fast as Thunder + Llama3-8b (relative to eager mode).

### Motivation

Below are data for:

* Llama3-8b: [figure]
* Mistral-7B-v0.1: [figure]
* ...
## 🐛 Bug

As can be seen below, Thunder is slower than torch.compile for single-GPU training of falcon-7b:

[figure]

Below are results for ThunderFX for multi-GPU training: ...
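For reference, a small self-contained timing harness of the kind one could use to compare the two compilation paths on a single GPU; the toy MLP below is only a stand-in for falcon-7b, so absolute numbers will not match the benchmark script's figures.

```python
import time
import torch
import thunder


def bench(step, iters: int = 20) -> float:
    # Warm up, then time `iters` training steps on the current CUDA device.
    for _ in range(3):
        step()
    torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        step()
    torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters


def make_step(compile_fn):
    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
    ).cuda()
    compiled = compile_fn(model)
    x = torch.randn(8, 4096, device="cuda")

    def step():
        compiled(x).sum().backward()

    return step


print("thunder  s/iter:", bench(make_step(thunder.jit)))
print("inductor s/iter:", bench(make_step(torch.compile)))
```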
## 🐛 Bug

When running Mistral-7B-v0.1 we get an OOM error. The same configuration passes for torch.compile.

### To Reproduce

Steps to reproduce the behavior: please use 1 node(s), each with ...
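Since the comparison here is about memory rather than speed, a sketch of how peak memory for one training step can be compared between the two paths; the small stand-in model obviously will not OOM, but the measurement pattern carries over to the Mistral-7B-v0.1 run in the benchmark script.

```python
import torch
import thunder


def peak_memory_mb(compile_fn) -> float:
    # Run a single forward+backward step and report the CUDA peak allocation in MiB.
    torch.cuda.empty_cache()
    torch.cuda.reset_peak_memory_stats()
    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096), torch.nn.GELU(), torch.nn.Linear(4096, 4096)
    ).cuda()
    compiled = compile_fn(model)
    x = torch.randn(8, 4096, device="cuda")
    compiled(x).sum().backward()
    torch.cuda.synchronize()
    return torch.cuda.max_memory_allocated() / 2**20


print("thunder  peak MiB:", peak_memory_mb(thunder.jit))
print("inductor peak MiB:", peak_memory_mb(torch.compile))
```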
## 🐛 Bug

Recently we got OOM errors causing failures of Gemma-2-2b (in canary runs) and of distributed training of stablecode-completion-alpha-3b.

### To Reproduce

Please use 1 node(s), each with 8 ...