int4 inverse quantization and gemm on existing templates

Open kahakuka opened this issue 1 year ago • 4 comments

Is it difficult to fuse int4 inverse quantization (dequantization) with GEMM using the existing templates? What suggestions do you have?

kahakuka avatar Mar 15 '24 08:03 kahakuka

HIP does not support sub-byte data types. Are you using int4x2?

zjing14 avatar Apr 07 '24 22:04 zjing14

@zjing14 Thank you for your answer. I am not using int4x2. The method quantizes fp16 values to int4 according to a certain pattern and stores them packed in a uint32. The paper introduces it this way: (image)
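To make the storage scheme concrete, here is a minimal host-side sketch of unpacking eight 4-bit values from one uint32 and dequantizing them. It is an illustration under stated assumptions, not the paper's layout: the function name `dequantize_u4x8` is hypothetical, value i is assumed to sit in bits [4i, 4i+3], the values are assumed unsigned with a single scale and zero point, and `float` stands in for fp16. The actual pattern in the paper may order the nibbles differently.

```cpp
#include <array>
#include <cstdint>

// Unpack eight unsigned 4-bit values from one 32-bit word and dequantize them.
// Assumption: value i occupies bits [4*i, 4*i+3] (plain sequential order).
// A different packing pattern only changes the index mapping, not the idea.
std::array<float, 8> dequantize_u4x8(std::uint32_t packed, float scale, float zero_point)
{
    std::array<float, 8> out{};
    for (int i = 0; i < 8; ++i)
    {
        // Shift nibble i down to the low bits and mask off everything else.
        const std::uint32_t q = (packed >> (4 * i)) & 0xFu;
        // Affine dequantization: real = (q - zero_point) * scale.
        out[i] = (static_cast<float>(q) - zero_point) * scale;
    }
    return out;
}
```

Because HIP has no sub-byte type, a kernel would do this kind of shift-and-mask on a 32-bit register instead of loading int4 elements directly.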

kahakuka avatar Apr 08 '24 00:04 kahakuka

@zjing14 Is it easy to implement int4 dequantization fused with GEMM in composable_kernel by following the MMA approach used in llm-awq?

llm-awq: for the int4 dequantization, see this file: https://github.com/mit-han-lab/llm-awq/blob/main/awq/kernels/csrc/quantization_new/dequantize.cuh
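For reference, "fusion" here means dequantizing each weight inside the GEMM loop instead of first writing a full fp16 weight matrix to memory. Below is a plain C++ reference sketch of that idea, useful mainly for checking a GPU kernel's output. Everything about the layout is an assumption for illustration: `gemm_w4_reference` is a hypothetical name, B is assumed to be stored row-major as K x (N/8) words with eight consecutive columns per word (low nibble first), with one scale and one zero point per column, and `float` stands in for fp16. It does not use composable_kernel or llm-awq APIs.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Reference for C[M x N] = A[M x K] * dequant(B)[K x N], all row-major.
// Assumed (hypothetical) layout: b_packed holds K x (N/8) words, each word
// packs 8 consecutive columns of one row of B, low nibble first.
// N is assumed to be a multiple of 8; scale and zero have N entries each.
std::vector<float> gemm_w4_reference(const std::vector<float>& a,
                                     const std::vector<std::uint32_t>& b_packed,
                                     const std::vector<float>& scale,
                                     const std::vector<float>& zero,
                                     std::size_t M, std::size_t N, std::size_t K)
{
    std::vector<float> c(M * N, 0.0f);
    const std::size_t words_per_row = N / 8;
    for (std::size_t m = 0; m < M; ++m)
    {
        for (std::size_t k = 0; k < K; ++k)
        {
            const float a_mk = a[m * K + k];
            for (std::size_t w = 0; w < words_per_row; ++w)
            {
                const std::uint32_t word = b_packed[k * words_per_row + w];
                for (std::size_t j = 0; j < 8; ++j)
                {
                    const std::size_t n = w * 8 + j;
                    // Dequantize on the fly; no dense copy of B is stored.
                    const std::uint32_t q = (word >> (4 * j)) & 0xFu;
                    const float b_kn = (static_cast<float>(q) - zero[n]) * scale[n];
                    c[m * N + n] += a_mk * b_kn;
                }
            }
        }
    }
    return c;
}
```

In a GPU kernel the same unpack step would sit between the global or shared memory load of B and the matrix-multiply instruction, so only the packed words travel through memory.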

kahakuka avatar Apr 10 '24 03:04 kahakuka

@xiabo123 Can you provide more context about your request? What do you intend to do in your project? If you have a code snippet to share, that would be helpful.

huanrwan-amd avatar Sep 27 '24 19:09 huanrwan-amd