RobertLiu0905

Results 3 comments of RobertLiu0905

I grabbed a flame chart, and the problem is in gptq.py#apply_weights#ops.gptq_gemm. It may be beyond the compute capacity of CUDA. ![image](https://github.com/vllm-project/vllm/assets/10494702/7316fd86-7c0b-4b20-bb87-36bc13875783) ![image](https://github.com/vllm-project/vllm/assets/10494702/fbb7ced7-f930-4683-9787-0fd505413384)

I also encountered this problem. I solved it by compiling NCCL from the [nccl source code](https://github.com/NVIDIA/nccl) and then changing the path of libnccl.so.2 in the vllm source code to point to the newly built library.
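
If it helps anyone trying the same workaround, here is a rough sketch for checking that a given libnccl.so.2 actually loads before pointing vllm at it. It only uses Python's `ctypes` and NCCL's `ncclGetVersion` call. The helper name `nccl_version` and the path in the example are my own placeholders, not anything from vllm.

```python
import ctypes
from typing import Optional


def nccl_version(lib_path: str) -> Optional[int]:
    """Return the raw NCCL version code from the library at lib_path.

    Returns None if the library cannot be loaded or the call fails.
    (Hypothetical helper for illustration; not part of vllm.)
    """
    try:
        lib = ctypes.CDLL(lib_path)
        get_version = lib.ncclGetVersion
    except (OSError, AttributeError):
        # File missing, not loadable, or not an NCCL library.
        return None

    # ncclResult_t ncclGetVersion(int* version); 0 means ncclSuccess.
    get_version.restype = ctypes.c_int
    get_version.argtypes = [ctypes.POINTER(ctypes.c_int)]

    version = ctypes.c_int()
    status = get_version(ctypes.byref(version))
    return version.value if status == 0 else None


if __name__ == "__main__":
    # Placeholder path: replace with wherever your build put the library.
    print(nccl_version("/path/to/nccl/build/lib/libnccl.so.2"))
```

A `None` result means the file at that path is not a loadable NCCL build, so changing the path in vllm to it would not help.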