Eviltuzki
> A latency difference of this magnitude is visible.
>
> How exactly did you test it, and what were the results?

I modified the echo-c++ example, setting `use_rdma=true` in both the client and server options. The client's request loop is as follows:

```
// Normally, you should not call a Channel directly, but instead construct
// a stub Service wrapping it. stub can be shared...
```
> Have you tried testing with the rdma_performance example?

No, the environment I had borrowed has already been returned, so I have nothing to test on anymore. In theory, though, shouldn't RDMA latency be lower, even for small payloads?
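For context, the configuration change described above amounts to flipping the `use_rdma` field on brpc's server and channel options. A minimal sketch, assuming brpc was built with RDMA support (`WITH_RDMA=ON`); the address and port are illustrative, not from the original test:

```cpp
#include <brpc/channel.h>
#include <brpc/server.h>

int main() {
    // Server side: accept connections over RDMA instead of TCP.
    brpc::ServerOptions server_options;
    server_options.use_rdma = true;

    // Client side: establish the channel over RDMA as well.
    // Both sides must enable it, as described in the comment above.
    brpc::ChannelOptions channel_options;
    channel_options.use_rdma = true;

    brpc::Channel channel;
    if (channel.Init("0.0.0.0:8002", &channel_options) != 0) {
        return -1;  // channel setup failed
    }
    return 0;
}
```

This is a configuration fragment, not a benchmark: the echo request loop itself is unchanged from the stock echo-c++ example.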
Same problem; try upgrading your host's kernel. I solved this by upgrading the kernel on CentOS 7 from 3.10 to 5.4.
> Thx @Eviltuzki for your good suggestion! @cncal it seems there are two possible ways to work around this problem for your case.
>
> * upgrade your host kernel...
same problem

```
(OpenAIChatCompletionsClient pid=164504) 422
(OpenAIChatCompletionsClient pid=164507) 422
(OpenAIChatCompletionsClient pid=164510) 422
(OpenAIChatCompletionsClient pid=164506) 422
(OpenAIChatCompletionsClient pid=164500) 422
(OpenAIChatCompletionsClient pid=164502) 422
(OpenAIChatCompletionsClient pid=164498) 422
Traceback (most recent call last):
  File "/home/llmperf/token_benchmark_ray.py", ...
```
> same problem here, how to fix this?

Try printing the response code and content.
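To act on that advice, a minimal sketch of surfacing the status code and body instead of a bare `422` in the log. The helper name is hypothetical, not part of llmperf; a 422 typically means the server rejected the request payload (e.g. an unsupported sampling parameter), so the body is what actually tells you what to fix:

```python
def describe_failure(status_code: int, body: str) -> str:
    """Return a readable summary of a failed HTTP response.

    Hypothetical helper: call this (or just print status/body directly)
    where the client receives a non-200 response.
    """
    if status_code == 422:
        # 422 Unprocessable Entity: the server parsed the request but
        # rejected its content, so the body usually names the bad field.
        return f"422 Unprocessable Entity, server said: {body}"
    return f"HTTP {status_code}: {body}"

print(describe_failure(422, '{"detail": "extra fields not permitted"}'))
```

With the body printed, you can see whether the endpoint rejected a parameter the benchmark sends by default, rather than guessing from the bare status code.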