Daniel Han
@wsp317 I fixed it just then! Sorry on the delay! If you're on a local machine, please update Unsloth via ``` pip uninstall unsloth -y pip install --upgrade --force-reinstall...
Sadly, full finetuning isn't yet supported - some Unsloth community members have tried it, and it does converge, albeit the layernorms are not trained
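If you want to see which norm layers end up frozen, here's a minimal sketch - it assumes a standard PyTorch / Hugging Face `model` object, and the helper name plus the name-matching keys are just placeholders:

```python
# Hedged sketch: list which layernorm weights would actually receive
# gradients during an attempted full finetune. Assumes `model` is a
# PyTorch / Hugging Face model; the function name is made up.
def report_layernorm_trainability(model):
    for name, param in model.named_parameters():
        # Match common layernorm naming conventions (layernorm / layer_norm / ln_ / .norm)
        if any(key in name.lower() for key in ("layernorm", "layer_norm", "ln_", ".norm")):
            print(f"{name}: requires_grad={param.requires_grad}")

report_layernorm_trainability(model)
```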
Apologies this slipped by me! Extremely sorry! Ye unfortunately Windows is a bit of an issue to support (due to Triton). See https://github.com/unslothai/unsloth/issues/210, which might be helpful
Very cool @Jiar !! Will check that out!
Yes, overly long contexts will cause OOMs. According to our blog (https://unsloth.ai/blog/llama3), the max context length on a Tesla T4 (16GB) is around 10K tokens
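One way to stay under that ceiling is to cap the sequence length at load time. A rough sketch, assuming the Unsloth `FastLanguageModel` API - the model name and the 8192 cap are just example values, not a recommendation from the blog:

```python
# Hedged sketch: cap the context length when loading so a 16GB T4 doesn't OOM.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",  # example model name
    max_seq_length = 8192,   # keep below the ~10K ceiling for a 16GB T4
    load_in_4bit = True,
)
```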
You need to change `merged_4bit_forced` to `merged_16bit`
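A quick sketch of the corrected call, assuming a finetuned `model` / `tokenizer` pair and a placeholder output path:

```python
# Instead of save_method = "merged_4bit_forced", merge the weights to 16-bit.
model.save_pretrained_merged(
    "output_dir",            # placeholder path
    tokenizer,
    save_method = "merged_16bit",
)
```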
Ye AWQ is nice :) We might be adding an AWQ option for exporting!
@subhamiitk Use `model.save_pretrained_merged("location", tokenizer, save_method = "merged_16bit",)` then use vLLM
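After the merged 16-bit export, the saved folder can be loaded directly by vLLM. A minimal sketch - "location" is whatever path was passed to `save_pretrained_merged`, and the prompt / sampling values are arbitrary:

```python
# Hedged sketch: serve the merged 16-bit checkpoint with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model = "location")  # same directory used in save_pretrained_merged
outputs = llm.generate(["Hello!"], SamplingParams(max_tokens = 64))
print(outputs[0].outputs[0].text)
```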
So sorry on the delay - just relocated to SF - exporting to AWQ is on the roadmap for now - directly finetuning AWQ could work as well, but will...
I'll see what I can do!