Daniel Han comments

Results: 1103 comments of Daniel Han


[FIXED] Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

@tongyx361 Apologies for the delay - yes, the new transformers update broke saving, so you need to overwrite the old tokenizer files by redownloading them

[FIXED] Exception: data did not match any variant of untagged enum ModelWrapper at line 1251003 column 3

@katopz @srsugandh Can you guys ask this on our Discord - probably a better place to get this resolved

how to Setting Default Output Format for Qwen2.5 Model Similar to DeepSeek-R1

GRPO leverages the system prompt from Qwen itself. So it's better to use:
```python
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("___unsloth_qwen_model__")
tokenizer.apply_chat_template([
    {"role" : "system", "content" : SYSTEM_PROMPT},
    {"role" :...
```
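A minimal sketch of how a system prompt can be combined with a user question and rendered through a Hugging Face tokenizer's `apply_chat_template` method. The contents of `SYSTEM_PROMPT` and the helper names `build_messages` and `render_prompt` are assumptions made for illustration; they are not taken from the comment, which is cut off after the system turn.

```python
# Hypothetical system prompt asking for an R1-style reasoning/answer layout.
# The exact wording is an assumption, not taken from the original comment.
SYSTEM_PROMPT = (
    "Respond in the following format:\n"
    "<reasoning>\n...\n</reasoning>\n"
    "<answer>\n...\n</answer>"
)


def build_messages(question, system_prompt=SYSTEM_PROMPT):
    """Return a chat conversation with a system turn followed by a user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]


def render_prompt(tokenizer, question):
    """Render the conversation to a prompt string with the model's chat template.

    `tokenizer` is expected to be a Hugging Face tokenizer, for example one
    loaded with AutoTokenizer.from_pretrained(...).
    """
    return tokenizer.apply_chat_template(
        build_messages(question),
        tokenize=False,              # return text instead of token ids
        add_generation_prompt=True,  # append the assistant turn header
    )
```

With a real tokenizer this would be called as `render_prompt(tokenizer, "What is 2 + 2?")`; the rendered text then carries the system prompt in whatever layout the model's own chat template defines.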

How much impact does the system prompt have on the output results?

Yes, the system prompt is important, but the number of steps is far too low - you probably need 500 to 2000

wondering if unsloth can finetune llm on code repo or codebase

It's better to ask this on our Discord!

RecursionError: maximum recursion depth exceeded in unsloth_zoo/compiler.py during Initialization

Just fixed - apologies for the issue! For local machines, please do:
```
pip install "unsloth>=2025.3.8" "unsloth_zoo>=2025.3.7" --upgrade --force-reinstall --no-deps
```
For Colab / Kaggle machines, please disconnect and restart...

RecursionError: maximum recursion depth exceeded in unsloth_zoo/compiler.py during Initialization

Oh, GRPO experiments are fine! These bugs are more related to Unsloth internals (i.e. how I optimize files, etc.) and will not affect training runs

Error Kernel crash with GRPO

@Cgrandjean Follow your original script, and try lowering `gpu_memory_utilization` to 0.5 or 0.4. I'm working on reducing VRAM consumption which will come in a few days!
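A rough sketch of where the lowered value would go, assuming the script loads the model through Unsloth's `FastLanguageModel.from_pretrained` with `fast_inference=True` (the comment does not show the original script, so this is a guess at its shape). The helper `grpo_load_kwargs` is a hypothetical name used only for illustration.

```python
def grpo_load_kwargs(gpu_memory_utilization=0.5):
    """Build loader keyword arguments with a reduced GPU memory fraction.

    The value is the fraction of GPU memory reserved for fast inference;
    lowering it leaves more headroom for training.
    """
    if not 0.0 < gpu_memory_utilization <= 1.0:
        raise ValueError("gpu_memory_utilization must be in (0, 1]")
    return {
        "fast_inference": True,
        "gpu_memory_utilization": gpu_memory_utilization,
    }


# Assumed usage on a GPU machine with Unsloth installed (not executed here):
#   from unsloth import FastLanguageModel
#   model, tokenizer = FastLanguageModel.from_pretrained(
#       model_name="...",  # the model from your original script
#       **grpo_load_kwargs(0.4),
#   )
```

If 0.5 still crashes the kernel, trying 0.4 next follows the suggestion above.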

Grpotrainer cannot find the "pad_token_id" when using Qwen2.5-VL-72B-Instruct model

Oh my - will work on a fix!

Installation fails on H100 with flash-attention undefined symbol error

Try uninstalling flash-attn! It's optional!