Sebastian Raschka

821 comments of Sebastian Raschka

If this PR gets revived some time, we should check out the `qkv_reassemble` function from #1341

Wow thanks for resurrecting it and pushing it forward!

That's fair; we would have to run the script with both fp16 and bf16. But this is not that different from saying "if your GPU does not support `--precision bf16-true`...
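A minimal sketch of the fallback idea: pick fp16 when the GPU lacks bf16 support. The helper name `pick_precision` and the wiring via `torch.cuda.is_bf16_supported()` are illustrative assumptions, not LitGPT's actual CLI logic; the precision strings match Lightning Fabric's `--precision` options.

```python
def pick_precision(gpu_supports_bf16: bool) -> str:
    """Return a Lightning-style precision flag with an fp16 fallback.

    Hypothetical helper: in practice the flag could come from
    `torch.cuda.is_bf16_supported()` when PyTorch is installed.
    """
    # Ampere-or-newer GPUs support bf16; older ones fall back to fp16.
    return "bf16-true" if gpu_supports_bf16 else "16-true"

print(pick_precision(True))   # bf16-true
print(pick_precision(False))  # 16-true
```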

If you have all the dependencies installed, that should be supported. You can check out the [tutorials/pretrain_tinyllama.md](https://github.com/Lightning-AI/litgpt/blob/main/tutorials/pretrain_tinyllama.md) tutorial in this repo. Let us know what results you get, I'd be...

Nice, I think in that case we can stay tuned. I wish there was a "snooze" option to hide an issue for like a few months and then get reminded...

That's a good question. We don't have a benchmark, but LitGPT already supports FlashAttention-2 via PyTorch's SDPA. The plan is to also support FlashAttention-3 (#1578)
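To illustrate the SDPA route mentioned above, here is a minimal sketch (shapes are arbitrary assumptions): `torch.nn.functional.scaled_dot_product_attention` dispatches to a FlashAttention-2 kernel on supported GPUs and falls back to the math implementation elsewhere, so the same call works on CPU.

```python
import torch
import torch.nn.functional as F

# Arbitrary example shapes: batch 2, 4 heads, sequence 8, head dim 16.
q = torch.randn(2, 4, 8, 16)
k = torch.randn(2, 4, 8, 16)
v = torch.randn(2, 4, 8, 16)

# On supported GPUs PyTorch can dispatch this to FlashAttention-2;
# on CPU it uses the math fallback, so the snippet runs anywhere.
out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
print(out.shape)  # torch.Size([2, 4, 8, 16])
```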

Good question. The number of iterations depends on the batch size: one epoch means one full pass over the dataset, so if you have a smaller batch size this will...
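The relationship above boils down to simple arithmetic; a sketch with made-up numbers:

```python
import math

def iterations_per_epoch(dataset_size: int, batch_size: int) -> int:
    # One epoch = one full pass over the dataset, so a smaller
    # batch size means more iterations per epoch.
    return math.ceil(dataset_size / batch_size)

# Hypothetical dataset of 10,000 samples:
print(iterations_per_epoch(10_000, 64))   # 157
print(iterations_per_epoch(10_000, 128))  # 79
```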

Hey, just pinging to see if you are still interested in pursuing this PR. Personally, I think it'd be awesome to support the Yi models in LitGPT. There have been...

Thanks for the note. I am not sure if we ever supported MPS devices for pretraining. We can take a look some time, but I don't have a timeline for...

Yes, it should work on CPU devices