Sebastian Raschka
Works great!
Based on the config file run, the train and val loss look great. The MMLU score is surprisingly low, though. There's nothing wrong with the finetuned model, and it works...
Awesome, this is great! Thanks for this amazing PR!
Thanks for the note. I just tried it, and both work for me. But yes, you could use `.data` instead. I.e., instead of `mnist_test_dataset[i][0][0, :, :]` you could use...
I added this as an alternative code line to Ch 14 in case others have the same issue.
Good point. Does the LitData section here help? https://github.com/Lightning-AI/litdata?tab=readme-ov-file#1-prepare-your-data
Personally, I use the `TextFiles` approach that I've implemented in LitGPT. But going back to an earlier comment you had (and the phrase in the docs), my colleagues don't recommend...
Thanks for updating the masking. I just added some tests to make the equivalent easier... it looks like the updated masking now creates a mismatch between the base model and...
I may have to rethink this when my brain is a bit fresher tomorrow morning, but I think the original code is correct because we don't recompute the older tokens,...
Thanks for the suggestion. I think it's a good idea here to make the code more explicit.