Eviltuzki
> A latency difference of this magnitude is visible.
>
> How exactly did you test it, and what were the results?

I modified the echo-c++ example, setting `use_rdma=true` in both the client and server options. The client's request loop is as follows:

```
// Normally, you should not call a Channel directly, but instead construct
// a stub Service wrapping it. stub can be shared...
```
> Have you tried testing with the rdma_performance example?

No, the environment I had borrowed has already been returned, so I have nothing to test on anymore. In theory, though, shouldn't RDMA latency be lower, even for small payloads?
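For context, the configuration change described above amounts to flipping the `use_rdma` field on brpc's server and channel options. A minimal sketch, assuming brpc was built with RDMA support (`WITH_RDMA=ON`); the address and port are illustrative, not from the original test:

```cpp
#include <brpc/channel.h>
#include <brpc/server.h>

int main() {
    // Server side: accept connections over RDMA instead of TCP.
    brpc::ServerOptions server_options;
    server_options.use_rdma = true;

    // Client side: establish the channel over RDMA as well.
    // Both sides must enable it, as described in the comment above.
    brpc::ChannelOptions channel_options;
    channel_options.use_rdma = true;

    brpc::Channel channel;
    if (channel.Init("0.0.0.0:8002", &channel_options) != 0) {
        return -1;  // channel setup failed
    }
    return 0;
}
```

This is a configuration fragment, not a benchmark: the echo request loop itself is unchanged from the stock echo-c++ example.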
Same problem; try upgrading your host's kernel. I solved this by upgrading the kernel on CentOS 7 from 3.10 to 5.4.
> Thx @Eviltuzki for your good suggestion! @cncal it seems there are two possible ways to work around this problem for your case.
>
> * upgrade your host kernel...
same problem

```
(OpenAIChatCompletionsClient pid=164504) 422
(OpenAIChatCompletionsClient pid=164507) 422
(OpenAIChatCompletionsClient pid=164510) 422
(OpenAIChatCompletionsClient pid=164506) 422
(OpenAIChatCompletionsClient pid=164500) 422
(OpenAIChatCompletionsClient pid=164502) 422
(OpenAIChatCompletionsClient pid=164498) 422
Traceback (most recent call last):
  File "/home/llmperf/token_benchmark_ray.py", ...
```
> same problem here, how to fix this?

Try printing the response code and content.
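To act on that advice, a minimal sketch of surfacing the status code and body instead of a bare `422` in the log. The helper name is hypothetical, not part of llmperf; a 422 typically means the server rejected the request payload (e.g. an unsupported sampling parameter), so the body is what actually tells you what to fix:

```python
def describe_failure(status_code: int, body: str) -> str:
    """Return a readable summary of a failed HTTP response.

    Hypothetical helper: call this (or just print status/body directly)
    where the client receives a non-200 response.
    """
    if status_code == 422:
        # 422 Unprocessable Entity: the server parsed the request but
        # rejected its content, so the body usually names the bad field.
        return f"422 Unprocessable Entity, server said: {body}"
    return f"HTTP {status_code}: {body}"

print(describe_failure(422, '{"detail": "extra fields not permitted"}'))
```

With the body printed, you can see whether the endpoint rejected a parameter the benchmark sends by default, rather than guessing from the bare status code.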