Daniel Han

Results 781 comments of Daniel Han

Thanks @shaon-chowdhury for debugging and helping! Appreciate it :) Sorry sadly I'm not an expert on ONNX, so can't be of much help :(

@rwl4 Currently we support Phi-3 Mini via https://colab.research.google.com/drive/1NvkBmkHfucGO3Ve9s1NKZvMNlw5p83ym?usp=sharing and https://huggingface.co/unsloth/Phi-3-mini-4k-instruct-bnb-4bit

@rwl4 @JackCloudman @joshib123 We support Phi-3 Medium and Mini now! See https://github.com/unslothai/unsloth/releases/tag/May-2024 (also includes Colabs) Small is still in the works! Please update Unsloth for local machines. For Colab or...

@joshib123 I don't think there's a bug - that probably means your learning rate is too high
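As a rough illustration of why a too-high learning rate can look like a bug, here is a toy sketch (not the setup from this thread): plain gradient descent on f(x) = x**2. The function name `gradient_descent` and the two learning rates are made up for the example.

```python
def gradient_descent(lr, steps=20, x0=1.0):
    """Minimize f(x) = x**2 with plain gradient descent; return the loss after each step."""
    x = x0
    losses = []
    for _ in range(steps):
        grad = 2.0 * x       # derivative of x**2
        x = x - lr * grad    # standard gradient descent update
        losses.append(x * x)
    return losses


# Hypothetical learning rates chosen only to show the two regimes.
stable = gradient_descent(lr=0.1)    # each step scales x by 0.8, so the loss shrinks
unstable = gradient_descent(lr=1.1)  # each step scales x by -1.2, so the loss grows

print(f"lr=0.1 final loss: {stable[-1]:.3e}")
print(f"lr=1.1 final loss: {unstable[-1]:.3e}")
```

With the smaller rate the loss decays steadily; with the larger one every step overshoots the minimum and the loss blows up, even though the code is identical.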

@anakin87 No sorry - Small is a vastly different architecture :(

Sadly training the norms will need gradients for the layernorms, which are horrifying to write up in Triton

@RonanKMcGovern Oh it can be done! It's not a normal thing to do, but it can be enabled - hmmm

Oh if norms and embed_tokens and everything is enabled, that's literally full finetuning, except the weight updates are low rank :)) The layernorm's gradients are just way too tedious...
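To make "the weight updates are low rank" a bit more concrete, here is a minimal sketch that compares how many numbers a full update to one weight matrix needs against a rank-r update written as B @ A. The layer shape and rank below are hypothetical and not taken from any particular model.

```python
def full_update_params(d_out, d_in):
    """Number of trainable values when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def low_rank_update_params(d_out, d_in, r):
    """Number of trainable values when the update is B @ A,
    with B of shape (d_out, r) and A of shape (r, d_in)."""
    return r * (d_out + d_in)


# Hypothetical layer shape and rank, for illustration only.
d_out, d_in, r = 4096, 4096, 16

full = full_update_params(d_out, d_in)
low_rank = low_rank_update_params(d_out, d_in, r)

print(f"full update:     {full} values")
print(f"rank-{r} update: {low_rank} values")
print(f"ratio:           {low_rank / full:.4%}")
```

The product B @ A can have rank at most r, so even when every layer is enabled for training, each matrix only moves within a small subspace compared with a true full finetune.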

If you turn on training the lm_head, then it might overfit, which is normal - I normally suggest just leaving it out