Daniel Han


@mouhsineguet You can try llamafying it via llama-factory. Also try searching HF models for `qwen 14b llama` - there might be some llamafied versions.
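In case it helps, a minimal sketch of searching the Hub programmatically with `huggingface_hub` (the query string is just an example):

```python
from huggingface_hub import HfApi

# Search the Hugging Face Hub for llamafied Qwen checkpoints.
api = HfApi()
for model in api.list_models(search="qwen 14b llama", limit=10):
    print(model.id)
```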

@JIBSIL What's the generation speed without Unsloth on Kaggle? Also, why 150/? Shouldn't it be len(output)/?
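For reference, a rough sketch of what I mean - divide the actual number of generated tokens by the elapsed time, rather than a hard-coded 150 (assumes `model` and `tokenizer` are already loaded; names are placeholders):

```python
import time

inputs = tokenizer(["Hello!"], return_tensors="pt").to("cuda")
start = time.time()
output = model.generate(**inputs, max_new_tokens=256)
elapsed = time.time() - start

# Count only the newly generated tokens, excluding the prompt.
new_tokens = output.shape[-1] - inputs["input_ids"].shape[-1]
print(f"{new_tokens / elapsed:.2f} tokens/sec")
```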

@JIBSIL Whoops my bad - I think I fixed it now!

@HirCoir Apologies - I was extremely busy this week, so I didn't have time to look at this! I'll see what I can do! @JIBSIL Also sorry I did not respond until now!...

@JIBSIL Fixed batched inference yesterday (after your comment!!). See https://github.com/unslothai/unsloth/issues/267#issuecomment-2034047189 for more info. You'll need to update Unsloth without any dependency updates via `pip install --upgrade --force-reinstall --no-cache-dir git+https://github.com/unslothai/unsloth.git` for...
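A rough sketch of batched inference after updating (model name and prompts are placeholders; decoder-only models generally need left padding for batched generation):

```python
from unsloth import FastLanguageModel

# Placeholder checkpoint - swap in your own model.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/mistral-7b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # enable fast inference mode

tokenizer.padding_side = "left"  # left padding for decoder-only generation
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

prompts = ["Hello, my name is", "The capital of France is"]
inputs = tokenizer(prompts, return_tensors="pt", padding=True).to("cuda")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
```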

@JIBSIL Hmmm, could be I'm not freeing something - let me check

@JIBSIL Many apologies for the delay - it's possible there is small memory fragmentation over time, which will cause OOMs, but yes, it's possibly because you're doing 7B and not...
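If fragmentation is the culprit, a hedged workaround sketch - free cached CUDA blocks between runs and/or enable PyTorch's expandable-segments allocator (recent PyTorch versions only; the helper name is hypothetical):

```python
import gc
import os
import torch

# Must be set before CUDA is initialized; can reduce fragmentation.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

def free_gpu_memory():
    # Drop dangling Python references, then release cached CUDA blocks.
    gc.collect()
    torch.cuda.empty_cache()

# e.g. call free_gpu_memory() between generation batches to curb OOMs.
```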

Oh, I think I made a typo, whoops - let me first check why I did it lol

@pdurasie Sorry, whoops - the PR took a long time. I did an overhaul, and I now actually allow you to use `map_eos_token = False` :)
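A minimal sketch of using it (assuming the `get_chat_template` API in `unsloth.chat_templates` and an already-loaded `tokenizer`):

```python
from unsloth.chat_templates import get_chat_template

# Keep the model's original EOS token instead of remapping it to the
# template's end token (e.g. <|im_end|> for ChatML).
tokenizer = get_chat_template(
    tokenizer,
    chat_template = "chatml",
    map_eos_token = False,
)
```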

@thedarkzeno Oh wait full finetuning - did you make all layers (Q, K, V, O, gate, up, down) + layernorms + lm_head, embeddings all trainable? I was gonna say I...
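For context, a hypothetical sketch of what I mean by making everything trainable on a Llama-style model (the name filters are illustrative, and `model` is assumed to be loaded):

```python
# Mark attention, MLP, layernorm, head, and embedding weights trainable.
trainable_keywords = (
    "q_proj", "k_proj", "v_proj", "o_proj",  # attention
    "gate_proj", "up_proj", "down_proj",     # MLP
    "norm",                                  # layernorms / RMSNorm
    "lm_head", "embed_tokens",               # head + embeddings
)

for name, param in model.named_parameters():
    param.requires_grad = any(k in name for k in trainable_keywords)

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"Trainable params: {trainable:,}")
```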