2 comments by flatLying
Hello, is there any progress on providing a plain Python agentic RL example (one that loads a local model)? I'm really looking forward to it!
I mean a complete, simple example without any framework support, just like the one the TRL documentation shows: https://huggingface.co/docs/trl/main/grpo_trainer

```python
# train_grpo.py
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

dataset =...
```