composable_kernel: int4 dequantization and GEMM on existing templates

Is it difficult to fuse int4 dequantization with GEMM using the existing templates? Do you have any suggestions?

Mar 15 '24 08:03 kahakuka

HIP does not support sub-byte data types. Are you using int4x2?

Apr 07 '24 22:04 zjing14

@zjing14 Thank you for your answer. I am not using int4x2. The fp16 weights are quantized to int4 according to a specific pattern and stored in a uint32. That is how the paper describes it.
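To make the storage format concrete, here is a minimal host-side sketch of packing eight 4-bit values into one uint32 and dequantizing them again. The function names (`pack_int4x8`, `dequantize_int4x8`), the sequential nibble order, and the single scale and zero point are assumptions for illustration only; the actual pattern in the paper and in llm-awq may order the nibbles differently and uses float instead of fp16 here only for portability.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Pack eight unsigned 4-bit values into one 32-bit word.
// Assumed layout (hypothetical): element i occupies bits [4*i, 4*i + 3].
inline std::uint32_t pack_int4x8(const std::array<std::uint8_t, 8>& q) {
    std::uint32_t word = 0;
    for (int i = 0; i < 8; ++i) {
        assert(q[i] < 16 && "value must fit in 4 bits");
        word |= static_cast<std::uint32_t>(q[i]) << (4 * i);
    }
    return word;
}

// Unpack the eight nibbles and dequantize: w = (q - zero_point) * scale.
inline std::array<float, 8> dequantize_int4x8(std::uint32_t word,
                                              float scale,
                                              int zero_point) {
    std::array<float, 8> out{};
    for (int i = 0; i < 8; ++i) {
        const int q = static_cast<int>((word >> (4 * i)) & 0xFu);
        out[i] = static_cast<float>(q - zero_point) * scale;
    }
    return out;
}
```

With this layout the values {0, 1, 2, 3, 12, 13, 14, 15} pack to 0xFEDC3210, and dequantizing with scale 0.5 and zero point 8 gives -4.0 for the first element and 3.5 for the last.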

Apr 08 '24 00:04 kahakuka

@zjing14 Would it be easy to implement fused int4 dequantization + GEMM in composable_kernel by following the mma approach used in llm-awq?

llm-awq: the int4 dequantization is handled in this file: https://github.com/mit-han-lab/llm-awq/blob/main/awq/kernels/csrc/quantization_new/dequantize.cuh
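For reference, this is roughly what I mean by "fused": the int4 weights are unpacked and dequantized inside the GEMM loop instead of being written out as a full fp16 matrix first. The sketch below is a plain CPU reference, not composable_kernel or llm-awq code. The function name `gemm_int4_dequant_ref`, the row-major packing of eight consecutive columns per uint32, and the per-column scale and zero point are all assumptions for illustration; float stands in for fp16.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference: C (M x N) = A (M x K) * dequant(B), with B (K x N) stored as int4.
// Assumed layout (hypothetical): B is row-major, eight consecutive columns per
// uint32, and column n sits in bits [4*(n % 8), 4*(n % 8) + 3] of its word.
// N is assumed to be a multiple of 8.
inline std::vector<float> gemm_int4_dequant_ref(
    const std::vector<float>& a,               // M x K, row-major
    const std::vector<std::uint32_t>& b_packed, // K x (N / 8) words
    const std::vector<float>& scale,           // one scale per column
    const std::vector<int>& zero,              // one zero point per column
    std::size_t M, std::size_t N, std::size_t K) {
    std::vector<float> c(M * N, 0.0f);
    const std::size_t words_per_row = N / 8;
    for (std::size_t m = 0; m < M; ++m) {
        for (std::size_t n = 0; n < N; ++n) {
            float acc = 0.0f;
            for (std::size_t k = 0; k < K; ++k) {
                // Unpack one nibble and dequantize it on the fly.
                const std::uint32_t word = b_packed[k * words_per_row + n / 8];
                const int q = static_cast<int>((word >> (4 * (n % 8))) & 0xFu);
                const float w = static_cast<float>(q - zero[n]) * scale[n];
                acc += a[m * K + k] * w;
            }
            c[m * N + n] = acc;
        }
    }
    return c;
}
```

In a GPU kernel the unpack and scale step would presumably run on registers right before the matrix-multiply instruction, which is the part I am unsure how to express with the existing templates.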

Apr 10 '24 03:04 kahakuka

@xiabo123 Can you provide more context about your request? What do you intend to do in your project? If you have a code snippet to share, that would be helpful.

Sep 27 '24 19:09 huanrwan-amd