verl
Server-based rollout: ChatScheduler sends requests out to workers with load balancing.
Add a queue to balance the workload across workers and get the full benefit of server-based rollout.
Related to https://github.com/volcengine/verl/issues/658
We need rate limiting, optional backpressure, and monitoring of active requests here.
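To make the request concrete, here is a rough sketch of what such a queue could look like with plain asyncio. It is not verl's API: `BalancedDispatcher`, `fake_worker`, `max_pending`, and `min_interval` are hypothetical names chosen for illustration, and the real ChatScheduler integration may need a different shape. Workers pull from a shared queue, so an idle worker picks up the next request instead of receiving a fixed share.

```python
import asyncio
import time


class BalancedDispatcher:
    """Hypothetical sketch; none of these names exist in verl."""

    def __init__(self, num_workers, max_pending=0, min_interval=0.0):
        # max_pending > 0 bounds the queue, so submit() waits when it is
        # full (optional backpressure); 0 means unbounded.
        self._queue = asyncio.Queue(maxsize=max_pending)
        self._num_workers = num_workers
        self._min_interval = min_interval  # seconds between dispatches
        self._next_slot = 0.0
        self._rate_lock = asyncio.Lock()
        self.active = [0] * num_workers     # in-flight requests per worker
        self.completed = [0] * num_workers  # finished requests per worker

    async def submit(self, request):
        fut = asyncio.get_running_loop().create_future()
        await self._queue.put((request, fut))  # blocks while the queue is full
        return fut

    async def _rate_limit(self):
        # Simple global rate limit: space dispatches by min_interval.
        if self._min_interval <= 0:
            return
        async with self._rate_lock:
            now = time.monotonic()
            wait = self._next_slot - now
            if wait > 0:
                await asyncio.sleep(wait)
            self._next_slot = max(now, self._next_slot) + self._min_interval

    async def _worker(self, idx, handler):
        while True:
            request, fut = await self._queue.get()
            await self._rate_limit()
            self.active[idx] += 1
            try:
                fut.set_result(await handler(idx, request))
            except Exception as exc:
                fut.set_exception(exc)
            finally:
                self.active[idx] -= 1
                self.completed[idx] += 1
                self._queue.task_done()

    async def run(self, requests, handler):
        workers = [
            asyncio.create_task(self._worker(i, handler))
            for i in range(self._num_workers)
        ]
        futures = [await self.submit(r) for r in requests]
        await self._queue.join()  # wait until every request is processed
        for w in workers:
            w.cancel()
        await asyncio.gather(*workers, return_exceptions=True)
        return [f.result() for f in futures]


async def fake_worker(idx, prompt):
    # Stand-in for a rollout server call; worker 0 is deliberately slower.
    await asyncio.sleep(0.05 if idx == 0 else 0.01)
    return f"w{idx}:{prompt}"


async def main():
    dispatcher = BalancedDispatcher(num_workers=2, max_pending=4)
    results = await dispatcher.run([f"p{i}" for i in range(12)], fake_worker)
    return dispatcher, results


dispatcher, results = asyncio.run(main())
print(dispatcher.completed, dispatcher.active)
```

In this toy run the faster worker would typically complete more requests than the slower one, which is the balancing effect asked for above. The `active` and `completed` counters are the simplest form of monitoring; a real implementation would probably export them as metrics.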