tutu329

Results 4 comments of tutu329

I reinstalled the webui today and hit the same issue. I did not have this problem before.

This is a problem with the Tsinghua mirror. You have to switch to a different one, for example the Aliyun mirror.

> I made a few updates and moved it to the [default branch](https://github.com/chu-tianxiang/vllm-gptq). Quantized embedding layers and output layers are added, as well as the QxW8 kernels. However the performance...

> ### Anything you want to discuss about vllm.
> I want to deploy qwen-1.5-32B, but there is a problem: Total number of attention heads (40) must be divisible by tensor parallel...
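
For what it's worth, the constraint in that error can be checked with a few lines of Python. This is only a sketch, assuming the truncated message refers to the tensor parallel size (the number of GPUs the model is split across); `valid_tensor_parallel_sizes` and `max_gpus` are hypothetical names used for illustration, not part of vllm.

```python
def valid_tensor_parallel_sizes(num_attention_heads: int, max_gpus: int) -> list[int]:
    """Return every GPU count from 1 to max_gpus that evenly divides the head count.

    Assumption: the only constraint checked here is the divisibility rule
    quoted in the error message; a real deployment may have other limits.
    """
    return [n for n in range(1, max_gpus + 1) if num_attention_heads % n == 0]


# 40 attention heads, as reported in the error message above.
sizes = valid_tensor_parallel_sizes(40, max_gpus=8)
print(sizes)
```

With 40 heads, sizes such as 3, 6, or 7 fail the divisibility check, so those GPU counts would be rejected under this rule.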