zhong zhuang
EETQ is an int8 per-channel, weight-only quantization library from Netease Fuxi AI Lab. Its high-performance GEMM kernels are derived from FasterTransformer and TensorRT-LLM. We fit it into vLLM and...
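For readers unfamiliar with the term, the following is a minimal pure-Python sketch of symmetric int8 per-channel weight-only quantization in general. It is not EETQ's implementation or API: the function names (`quantize_per_channel`, `dequantize_per_channel`, `matvec_weight_only`) are hypothetical, and treating each row of the weight matrix as one output channel is an assumption made for illustration. The real kernels run fused GEMMs on the GPU.

```python
def quantize_per_channel(weight):
    """Symmetric int8 quantization with one scale per output channel (row).

    Hypothetical helper for illustration; not part of EETQ or vLLM.
    """
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # An all-zero channel gets scale 1.0 to avoid division by zero.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric int8 range.
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize_per_channel(q_rows, scales):
    """Recover approximate float weights: w ~= q * scale, per channel."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


def matvec_weight_only(q_rows, scales, x):
    """Compute y = W x from int8 weights; activations stay in floating point."""
    return [
        s * sum(q * xi for q, xi in zip(row, x))
        for row, s in zip(q_rows, scales)
    ]
```

"Weight only" here means that only the weights are stored as int8. The activations `x` remain in floating point, and each channel's scale is applied once to the accumulated sum.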