Sebastian Raschka
Oh, we can actually keep it open; I think it would be a nice thing to add some day. Thanks for raising that!
I'd say we ideally need to add it to the pretrain code (https://github.com/Lightning-AI/litgpt/blob/main/litgpt/pretrain.py) so that it can be used in general with all datasets.
Thanks for reporting this. There are currently a few other issues on my list, but I hope to be able to address this sometime.
@awaelchli Thanks! I am fairly certain now that it was due to an incomplete KV-cache clearing (#1596).
We addressed this in #1596
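For context, here is a minimal sketch of why an incompletely cleared KV cache leaks state between generation calls. This is not litgpt's actual implementation; the `KVCache` class and its method names below are hypothetical and only illustrate the idea that zeroing the cached tensors is not enough if the fill pointer is left untouched.

```python
import torch

class KVCache:
    """Hypothetical per-layer key/value cache for autoregressive decoding."""
    def __init__(self, max_seq_len: int, n_heads: int, head_dim: int):
        self.k = torch.zeros(1, n_heads, max_seq_len, head_dim)
        self.v = torch.zeros(1, n_heads, max_seq_len, head_dim)
        self.length = 0  # number of cache positions currently filled

    def append(self, k_new: torch.Tensor, v_new: torch.Tensor) -> None:
        # k_new and v_new have shape (1, n_heads, t, head_dim)
        t = k_new.shape[2]
        self.k[:, :, self.length:self.length + t] = k_new
        self.v[:, :, self.length:self.length + t] = v_new
        self.length += t

    def reset(self) -> None:
        # A complete reset clears both the tensors and the fill pointer.
        # Forgetting either leaves stale keys/values that the next prompt
        # would attend to, producing subtly wrong generations.
        self.k.zero_()
        self.v.zero_()
        self.length = 0

cache = KVCache(max_seq_len=128, n_heads=4, head_dim=16)
cache.append(torch.randn(1, 4, 10, 16), torch.randn(1, 4, 10, 16))
cache.reset()  # call this between independent prompts
assert cache.length == 0 and cache.k.abs().sum() == 0
```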
Just read the paper and came here to suggest it, only to see you were already faster 😊 What I currently don't understand is why they need to shrink the number...
Thanks for flagging this. I know Mistral uses its own tokenizer, but I could swear this worked before. Something to look into sometime.
Good question; intuitively, I'd say that's a good point. @awaelchli, what are your thoughts here? I think you have some experience running pretraining on multi-node setups.
Oh yeah, that would be a good idea. I think it might require some adjustments in other places as well. It's on my backlog, but I'm not sure...
I completely missed this earlier. Many thanks for the fix, @d-kleine! @Arminius4, after the recent merge, this should now also work via the main branch:

```bash
pip install...
```