QAT for LLM
Are there any documents about QAT code for LLMs? I want to do QAT on Llama 3, but I can only find example code for mobilenet_v2. I think there are some gaps between different types of models. Could you provide me with some help?
There seems to be a QAT script for Llama 2 7B: https://github.com/quic/aimet/blob/develop/Examples/torch/quantization/llm_qat_kd/finetune_llm_qat_kd.py
I wonder how much the quantization loss decreases with this recipe.
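For anyone landing here: the generic AIMET QAT flow (wrap the model in a quant sim, calibrate encodings, fine-tune, export) should in principle carry over to a causal LM. Below is a minimal sketch assuming the aimet_torch v1 `QuantizationSimModel` API and a Hugging Face checkpoint; the model name, bit widths, calibration data, and training loop are placeholders, not the recipe from the linked `finetune_llm_qat_kd.py` script, and tracing a full Llama model may still need the model-specific adaptations that script contains.

```python
# Minimal QAT sketch with AIMET (aimet_torch v1 API) on a causal LM.
# Assumptions: a Hugging Face checkpoint you have access to, placeholder
# calibration data, and a plain fine-tuning loop. This is NOT the recipe
# from the linked finetune_llm_qat_kd.py script, just the generic flow.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from aimet_common.defs import QuantScheme
from aimet_torch.quantsim import QuantizationSimModel

model_name = "meta-llama/Meta-Llama-3-8B"  # placeholder; any causal LM works
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

# Dummy input so AIMET can trace the graph and place quantizers.
dummy_input = torch.randint(0, tokenizer.vocab_size, (1, 128))

sim = QuantizationSimModel(
    model=model,
    dummy_input=dummy_input,
    # Range-learning scheme: quantizer min/max become learnable for QAT.
    quant_scheme=QuantScheme.training_range_learning_with_tf_init,
    default_param_bw=4,   # e.g. W4A8; choose widths for your target
    default_output_bw=8,
)

# Calibrate initial encodings with a few representative batches.
def calibrate(quantized_model, _):
    with torch.no_grad():
        quantized_model(dummy_input)  # replace with real calibration data

sim.compute_encodings(forward_pass_callback=calibrate,
                      forward_pass_callback_args=None)

# QAT: fine-tune sim.model with an ordinary training loop; under the
# range-learning scheme the encodings train jointly with the weights.
optimizer = torch.optim.AdamW(sim.model.parameters(), lr=1e-5)
sim.model.train()
for batch in []:  # replace with your DataLoader of tokenized text
    out = sim.model(input_ids=batch["input_ids"], labels=batch["input_ids"])
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# Export the quantized model and encodings for the downstream toolchain.
sim.export(path="./out", filename_prefix="llama3_qat", dummy_input=dummy_input)
```

The linked example additionally uses knowledge distillation from the full-precision model as the teacher, which the sketch above omits for brevity.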