nantenT

Results: 2 comments by nantenT

Thank you so much for taking the time to respond! I truly appreciate your insights. Regarding point 2, I wanted to seek further clarification: does the process involve performing SFT...

The current Reward Loop implementation appears to launch multiple workers for parallel reward computation, but I'm using an external Generative Reward Model service with limited concurrency capacity. I cannot find...