QAT for LLM
Are there any documents about QAT code for LLMs? I want to do QAT on Llama 3, but I can only find example code for mobilenet_v2. I think there are some gaps between different types of models. Could you provide me with some help?
There seems to be a QAT script for Llama 2 7B: https://github.com/quic/aimet/blob/develop/Examples/torch/quantization/llm_qat_kd/finetune_llm_qat_kd.py
I wonder how much the quantization loss decreases with this recipe.
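For anyone landing here: the generic AIMET QAT flow (wrap the model in a quant sim, calibrate encodings, fine-tune, export) should in principle carry over to a causal LM. Below is a minimal sketch assuming the aimet_torch v1 `QuantizationSimModel` API and a Hugging Face checkpoint; the model name, bit widths, calibration data, and training loop are placeholders, not the recipe from the linked `finetune_llm_qat_kd.py` script, and tracing a full Llama model may still need the model-specific adaptations that script contains.

```python
# Minimal QAT sketch with AIMET (aimet_torch v1 API) on a causal LM.
# Assumptions: a Hugging Face checkpoint you have access to, placeholder
# calibration data, and a plain fine-tuning loop. This is NOT the recipe
# from the linked finetune_llm_qat_kd.py script, just the generic flow.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model_name = "meta-llama/Meta-Llama-3-8B"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Dummy input so AIMET can trace the graph and place quantizers.
dummy_input = torch.randint(0, tokenizer.vocab_size, (1, 128))

sim = QuantizationSimModel(
    model=model,
    dummy_input=dummy_input,
    # Range-learning scheme: quantizer min/max become learnable for QAT.
    quant_scheme=QuantScheme.training_range_learning_with_tf_init,
    default_param_bw=4,   # e.g. W4A8; choose widths for your target
    default_output_bw=8,
)

# Calibrate initial encodings with a few representative batches.
def calibrate(quantized_model, _):
    with torch.no_grad():
        quantized_model(dummy_input)  # replace with real calibration data

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# QAT: fine-tune sim.model with an ordinary training loop; under the
# range-learning scheme the encodings train jointly with the weights.
optimizer = torch.optim.AdamW(sim.model.parameters(), lr=1e-5)
sim.model.train()
for batch in []:  # replace with your DataLoader of tokenized text
    out = sim.model(input_ids=batch["input_ids"], labels=batch["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Export the quantized model and encodings for the downstream toolchain.
sim.export(path="./out", filename_prefix="llama3_qat", dummy_input=dummy_input)
```

The linked example additionally uses knowledge distillation from the full-precision model as the teacher, which the sketch above omits for brevity.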