Sebastian Raschka

821 comments by Sebastian Raschka

> I wanted to fine-tune all parameters of the Gemma model. I noticed there is an example, https://github.com/Lightning-AI/litgpt/blob/main/litgpt/finetune/full.py. Can I use this example for domain-specific fine-tuning? Yes, this model...

Sorry for the late response, but it's been a busy week. Regarding the domain-specific finetuning, it would be similar to continued pretraining without instructions in your case, correct? Regarding the...

Well, maybe ignore the last commit. Lazy loading now works when you use 1 device, but it fails when using multiple devices and DeepSpeed. The previous commit without lazy...

Thanks for bringing that up! I think `reset_parameters()` won't set the weights to 0, though; it reinitializes them, if I understand correctly. So I think this should be okay but...
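To illustrate the point about `reset_parameters()`, here is a minimal sketch using a plain `nn.Linear` layer (the layer sizes and the constant fill value are made up for the demonstration, not taken from the original thread). It overwrites the weights with a constant, calls `reset_parameters()`, and checks that the result is neither all zeros nor the old constant.

```python
import torch
import torch.nn as nn

torch.manual_seed(123)

# Hypothetical small layer, used only for illustration
layer = nn.Linear(4, 3)

# Overwrite the weights with a known constant so the reset is easy to observe
with torch.no_grad():
    layer.weight.fill_(1.0)

# Re-run the layer's default initialization scheme
layer.reset_parameters()

weights_all_zero = bool(torch.all(layer.weight == 0))
weights_still_one = bool(torch.all(layer.weight == 1.0))

print("All zero after reset:", weights_all_zero)
print("Still the old constant:", weights_still_one)
```

Both flags should come out `False` here, since `nn.Linear.reset_parameters()` draws fresh random values rather than zeroing the tensor. Other module types may define `reset_parameters()` differently, so it is worth checking the specific layer in question.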

Thanks for the PR, I appreciate it! However, I think in this notebook importing torch is not necessary as we are working with `import torch.nn as nn`. Did you bump...

Ah yes, good call. I see that I imported it here:

Thanks again for moving the `import torch` line up. I just removed the 2nd import to reduce redundancy.

Thanks for bringing this up! Regarding removing the `:num_tokens` slicing from `self.mask.bool()[:num_tokens, :num_tokens]`: that's unfortunately not possible, as @ahmedDaoudi-u mentioned. E.g., in Ch05, we are using an LLM...
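A minimal sketch of why the slicing matters, assuming a causal mask that is built once for a fixed `context_length` and an input that is shorter than that (the sizes and the standalone `mask` and `attn_scores` tensors below are illustrative stand-ins, not the book's actual module):

```python
import torch

torch.manual_seed(123)

context_length = 6   # size the mask was allocated for (assumed value)
num_tokens = 4       # actual length of the current input (assumed value)

# Upper-triangular causal mask, created once for the maximum context length
mask = torch.triu(torch.ones(context_length, context_length), diagonal=1)

# Attention scores for the shorter input: shape (num_tokens, num_tokens)
attn_scores = torch.randn(num_tokens, num_tokens)

# With slicing, the mask is cropped to match the scores
masked = attn_scores.masked_fill(
    mask.bool()[:num_tokens, :num_tokens], -torch.inf
)

# Without slicing, the (6, 6) mask cannot be broadcast to the (4, 4) scores
try:
    attn_scores.masked_fill(mask.bool(), -torch.inf)
    unsliced_ok = True
except RuntimeError:
    unsliced_ok = False

print(masked.shape, unsliced_ok)
```

If every input always had exactly `context_length` tokens, the slice would be a no-op, which is presumably why it can look redundant at first glance.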

Good eye for detail. Actually, the +1 wasn't necessary, so I updated that a while back in the notebook and manuscript. I think you are seeing the old +1 in...

Ah yes, big thanks for the follow-up! I think I may have missed one. I probably did a find+replace looking for `stride=max_length+1` and then missed the one you had...
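For context on the `stride=max_length+1` versus `stride=max_length` difference, here is a rough sketch under the assumption that inputs are cut from a token sequence with a simple sliding window. The `input_chunks` helper and the toy token IDs are hypothetical and only meant to show the effect of the stride.

```python
def input_chunks(token_ids, max_length, stride):
    """Hypothetical helper: slice input windows of length max_length."""
    return [
        token_ids[i:i + max_length]
        for i in range(0, len(token_ids) - max_length, stride)
    ]

token_ids = list(range(12))  # toy stand-in for tokenized text
max_length = 4

chunks_plain = input_chunks(token_ids, max_length, stride=max_length)
chunks_plus1 = input_chunks(token_ids, max_length, stride=max_length + 1)

print(chunks_plain)  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(chunks_plus1)  # [[0, 1, 2, 3], [5, 6, 7, 8]]
```

In this sketch, `stride=max_length` yields back-to-back windows, while the extra `+1` skips one token (here, token 4) between consecutive input windows.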