zhong zhuang

Results: 1 issue by zhong zhuang

EETQ is an INT8 per-channel weight-only quantization library developed by NetEase Fuxi AI Lab. Its high-performance GEMM kernels are derived from FasterTransformer and TensorRT-LLM. We fit it into vLLM and...
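
For readers unfamiliar with the term, the following is a minimal sketch of what per-channel INT8 weight-only quantization generally means: each output channel (row) of a weight matrix gets its own scale, weights are stored as int8, and activations stay in floating point. This is a generic pure-Python illustration under the assumption of symmetric quantization. It is not EETQ's API or its GEMM kernels, and the function names (`quantize_per_channel`, `dequantize_per_channel`, `matvec_weight_only`) are hypothetical.

```python
# Illustrative sketch only: generic symmetric per-channel INT8 weight-only
# quantization. Not EETQ's API or kernels; all names here are hypothetical.


def quantize_per_channel(weight):
    """Quantize each row (output channel) of a 2-D weight list to int8 range.

    Returns (q_rows, scales) such that weight[i][j] ~= q_rows[i][j] * scales[i].
    """
    q_rows, scales = [], []
    for row in weight:
        max_abs = max(abs(v) for v in row)
        # One scale per channel; fall back to 1.0 for an all-zero row.
        scale = max_abs / 127.0 if max_abs > 0 else 1.0
        # Round to the nearest integer and clamp to the symmetric int8 range.
        q_rows.append([max(-127, min(127, round(v / scale))) for v in row])
        scales.append(scale)
    return q_rows, scales


def dequantize_per_channel(q_rows, scales):
    """Recover approximate float weights from int8 values and channel scales."""
    return [[q * s for q in row] for row, s in zip(q_rows, scales)]


def matvec_weight_only(q_rows, scales, x):
    """Compute y = W x with W held as int8 and the activations x in float.

    The per-channel scale is applied once per output element.
    """
    return [
        s * sum(q * xi for q, xi in zip(row, x))
        for row, s in zip(q_rows, scales)
    ]


if __name__ == "__main__":
    w = [[0.5, -1.0, 0.25], [10.0, 0.0, -5.0]]
    q, s = quantize_per_channel(w)
    print(q, s)
    print(matvec_weight_only(q, s, [1.0, 2.0, 3.0]))
```

Because each row has its own scale, a row with large weights does not reduce the precision available to a row with small weights, which is the usual motivation for per-channel over per-tensor scaling.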