Mathieu Chartier
In your example, there's a number that surprises me, and I can't explain it. If you look at the Unsloth notebook for continued pretraining (CPT) with Mistral-7b, r=128 and...
Unfortunately it doesn't work! :-( I saw that when I run "unsloth/phi-4-unsloth-bnb-4bit", Unsloth says "Unsloth 2025.3.19: Fast Llama patching", but when it's "unsloth/Phi-4-mini-instruct-bnb-4bit", it says "Unsloth 2025.3.19: Fast Phi3 patching"....
Thanks, I will try with a new env (even though my current env is recent).
I noticed something during my tests. I have a dataset composed of raw texts in a JSONL. When I target only the first 2000 texts, I can do CPT with...
It doesn't work with UNSLOTH_COMPILE_DISABLE, sorry... And, just for your information, the same issue exists for inference too, not only for training.