unsloth
How to make sure GRPOTrainer is patched when I subclass it?
In my original script, I subclass GRPOTrainer for my use case. I also have an input arg that controls whether I want to use Unsloth. I would like to:
- Call PatchFastRL optionally, depending on an input arg
- Make sure that when the arg is true, the subclass is built on the patched GRPOTrainer
Right now, since I am not calling PatchFastRL at the beginning of the program, the subclassing is done with the original GRPOTrainer instead.
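To make the problem concrete, here is a minimal sketch of the pattern I think I need. It assumes the patch works by replacing the GRPOTrainer class on the module it lives in, which I have not verified. `make_fake_trl`, `fake_patch`, and `build_trainer_cls` are hypothetical stand-ins so the snippet runs without trl or Unsloth installed; they are not real library APIs. The idea is that Python binds the base class when the `class` statement executes, so the subclass has to be defined after the optional patch has run.

```python
import types


def make_fake_trl():
    """Build a hypothetical stand-in for the `trl` module (not the real library)."""
    module = types.ModuleType("fake_trl")

    class GRPOTrainer:
        backend = "original"

    module.GRPOTrainer = GRPOTrainer
    return module


def fake_patch(module):
    """Hypothetical stand-in for PatchFastRL.

    Assumption: the real patch swaps the trainer class on the module.
    """
    class PatchedGRPOTrainer(module.GRPOTrainer):
        backend = "patched"

    module.GRPOTrainer = PatchedGRPOTrainer


def build_trainer_cls(module, use_unsloth):
    """Optionally patch first, then define the subclass.

    The base class is looked up after patching, so the subclass
    inherits from whichever GRPOTrainer is current at that point.
    """
    if use_unsloth:
        fake_patch(module)

    base = module.GRPOTrainer  # resolved only after the optional patch

    class MyGRPOTrainer(base):
        """Custom trainer; overrides for the use case would go here."""

    return MyGRPOTrainer


# Each call gets a fresh stand-in module so the two cases do not interfere.
plain_cls = build_trainer_cls(make_fake_trl(), use_unsloth=False)
fast_cls = build_trainer_cls(make_fake_trl(), use_unsloth=True)
```

In a real script, the equivalent would be to parse the input arg first, call PatchFastRL only if it is set, and only then import GRPOTrainer and define the subclass (for example inside a function like the one above, rather than at module top level).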