Grounding_LLMs_with_online_RL
Can't run using lamorel
I tried running your code by following the procedure described here:
https://github.com/flowersteam/lamorel/tree/main/examples/PPO_finetuning
But I couldn't get it to work. I would be really grateful if you could let me know how to reproduce your results.
And again, awesome work; it was a really good read.