GRPO results do not match the dataset

Open calvin2021y opened this issue 1 year ago • 2 comments

I used Gemma 3 (1B) and picked some questions from openai/gsm8k. The results are pretty bad compared with Gemma 3 (1B) without fine-tuning.

I ran it with llama-cli -m ./gemma-3-finetune.Q8_0.gguf. Am I doing something wrong?

I also tried changing the settings to max_step=300, save_step=250, and max_seq_length=4096.
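To make "pretty bad" measurable, the fine-tuned and base models could be scored on the same GSM8K questions. Below is a minimal sketch, not the notebook's own evaluation code. It relies on the fact that GSM8K reference answers end with `#### <number>`, and it assumes (as a simple heuristic) that the last number in the model's output is its final answer. The function names and the sample strings are hypothetical.

```python
import re

# Matches integers and decimals, with optional thousands separators.
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def gold_answer(answer_field: str) -> float:
    """Read the reference value after the '####' marker in a GSM8K answer."""
    return float(answer_field.split("####")[-1].strip().replace(",", ""))


def predicted_answer(model_output: str):
    """Heuristic (an assumption): treat the last number in the output as the answer."""
    matches = _NUMBER.findall(model_output)
    if not matches:
        return None  # the model produced no number at all
    return float(matches[-1].replace(",", ""))


def accuracy(pairs) -> float:
    """Fraction of (model_output, gsm8k_answer_field) pairs that agree."""
    pairs = list(pairs)
    if not pairs:
        return 0.0
    correct = sum(predicted_answer(out) == gold_answer(ref) for out, ref in pairs)
    return correct / len(pairs)


if __name__ == "__main__":
    # Made-up strings for illustration only, not real model outputs.
    samples = [
        ("She sells the rest, so the answer is 18.", "16 - 3 - 4 = 9 ... #### 18"),
        ("That makes 1,000 in total", "... #### 1000"),
        ("I think it is 7", "... #### 8"),
    ]
    print(f"accuracy: {accuracy(samples):.2f}")
```

Running both GGUF files over the same question list and comparing the two accuracy values gives a concrete number to report. If the fine-tuned model was trained to put its answer inside specific tags, the extraction heuristic would need to be adapted to that format.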

Apr 09 '25 05:04 calvin2021y

Hello, are you still having this problem? By the way, you can join our Discord at discord.gg/unsloth, where many people can help you :D

May 24 '25 16:05 Erland366

Thanks for the tips. I will try again and report back later.

May 26 '25 05:05 calvin2021y