verl [Question] How to call reward model in rule based reward function?

Open · zsychina opened this issue 10 hours ago · 1 comment

Is it possible to call a generative model, e.g. a locally deployed model or a remote API model, from within a reward function?

I want to use a Qwen model to generate a reward for the text after ####. What is the best practice for doing this? I note that calling a model directly in the reward function may cause severe efficiency problems.
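To make the setup concrete, here is a minimal sketch of the kind of reward function in question. It is not verl's API: `extract_answer`, `score_batch`, `dummy_judge`, and the `max_workers` parameter are hypothetical names used only for illustration. The judge is passed in as a plain callable, so it could wrap a locally deployed Qwen model or a remote API client. Scoring a whole batch concurrently, rather than one blocking call per sample, is one possible way to reduce the efficiency cost.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional

ANSWER_MARKER = "####"


def extract_answer(response: str) -> Optional[str]:
    """Return the text after the last '####' marker, or None if it is absent."""
    if ANSWER_MARKER not in response:
        return None
    return response.rsplit(ANSWER_MARKER, 1)[1].strip()


def score_batch(
    responses: List[str],
    judge: Callable[[str], float],
    max_workers: int = 8,
) -> List[float]:
    """Score each response with `judge`, running the judge calls concurrently.

    `judge` is a hypothetical callable standing in for a model call
    (local server or remote API). Responses without a marker get 0.0
    and never reach the judge.
    """
    answers = [extract_answer(r) for r in responses]

    def score_one(answer: Optional[str]) -> float:
        if answer is None:
            return 0.0
        return float(judge(answer))

    # Threads suit this case because model calls are I/O-bound (network waits).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(score_one, answers))


def dummy_judge(answer: str) -> float:
    """Placeholder judge for testing; a real one would query the model."""
    return 1.0 if answer == "42" else 0.0


if __name__ == "__main__":
    batch = ["some reasoning #### 42", "no marker here", "other reasoning #### 7"]
    print(score_batch(batch, dummy_judge))
```

Because the judge is injected, the rule-based part (extracting the text after ####) can be tested without a model, and the model-backed part can be swapped or rate-limited independently.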

Thank you, any replies will be helpful!

Mar 01 '25 07:03 zsychina
