gushiqiao
@helloyongyang
Is it caused by the prompt? https://github.com/ModelTC/llmc/blob/bc9367fb8088e9040cc3d20c8ce7e44c32d95e8c/llmc/eval/eval_code.py#L20C8-L20C9
https://github.com/ModelTC/llmc/blob/b0bf39e96a0ce44f74ec9a42729c09f6cd6f893e/configs/quantization/methods/MixPrecision/rtn_w_a_static.yml#L37
This setting is deployment-friendly. The previous code structure was somewhat messy, so for now we've opted for simplified support for 8-bit and 16-bit mixed precision. In theory, all methods...
Thank you for sharing your proposed solution. I believe it is well thought out and valuable. Please feel free to submit a pull request, and I will be happy to...
https://github.com/ModelTC/llmc/blob/50b0da743e90c28b9df4360dbac74d93c6a1e504/llmc/compression/quantization/module_utils.py#L475
Since SpinQuant requires training, it may involve significant changes to the code structure. Therefore, we do not plan to merge it into the main branch in the near term. Thank...
I’ll try to set aside some time to add support for loading pre-trained rotation matrices directly.