llmperf
fix: subsequent requests cannot be sent until all 'num_concurrent_requests' requests have finished in non-blocking mode
Issues
- https://github.com/ray-project/llmperf/issues/43
- https://github.com/ray-project/llmperf/issues/56
Summary
- Subsequent requests cannot be sent until all outstanding requests have finished, even in non-blocking mode.
- Fixing the existing request launcher directly was difficult because of its dependency on Ray, so I used multiple threads instead, each with its own request launcher. Each launcher holds one client and controls only one request.
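
The thread-per-launcher idea can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the code in this PR: `fake_llm_request` and `run_load_test` are hypothetical names, and the simulated client stands in for a real LLM client call. Each worker thread handles exactly one request at a time and picks up the next one as soon as its own request finishes, so no thread waits for the whole batch.

```python
import threading
import time


def fake_llm_request(request_id: int) -> dict:
    """Hypothetical stand-in for a real client call; latency varies per request."""
    latency = 0.01 * (1 + request_id % 3)
    time.sleep(latency)
    return {"id": request_id, "latency_s": latency}


def run_load_test(num_concurrent_requests: int, max_requests: int) -> list:
    """Send max_requests requests with at most num_concurrent_requests in flight."""
    results = []
    lock = threading.Lock()
    next_id = 0  # next request index to hand out, guarded by `lock`

    def worker() -> None:
        nonlocal next_id
        while True:
            # Claim the next request, or stop if none are left.
            with lock:
                if next_id >= max_requests:
                    return
                request_id = next_id
                next_id += 1
            # Send one request; other workers are not blocked while this runs.
            result = fake_llm_request(request_id)
            with lock:
                results.append(result)

    threads = [
        threading.Thread(target=worker) for _ in range(num_concurrent_requests)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

With this layout, a slow request delays only the thread that issued it; the remaining threads keep sending new requests until `max_requests` is reached.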