Sebastian Raschka

820 comments by Sebastian Raschka

Based on the config file run, the train and val loss look great. The MMLU score is surprisingly low, though. There's nothing wrong with the finetuned model, however, and it works...

Awesome, this is great! Thanks for this amazing PR!

Thanks for the note. I just tried it, and both work for me. But yes, you could use `.data` instead. That is, instead of `mnist_test_dataset[i][0][0, :, :]` you could use...

I added this as an alternative code line to Ch 14 in case others have the same issue.

Good point. Does the LitData section here help? https://github.com/Lightning-AI/litdata?tab=readme-ov-file#1-prepare-your-data

Personally, I use the `TextFiles` approach that I've implemented in LitGPT. But going back to an earlier comment you had (and the phrase in the docs), my colleagues don't recommend...

Thanks for updating the masking. I just added some tests to make the equivalence check easier... it looks like the updated masking now creates a mismatch between the base model and...

I may have to rethink this when my brain is a bit fresher tomorrow morning, but I think the original code is correct because we don't recompute the older tokens,...

Thanks for the suggestion. I think it's a good idea here to make the code more explicit.