QuaRot
Question about the rotation
Thanks for the wonderful work. However, I have run into a problem with the code.
The behavior I see does not match what is described in the Introduction of your paper. As stated there, the rotation should not alter the model's outputs. However, when I ran the provided code, I obtained results that contradict this claim.
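To make clear what I expected, here is a minimal sketch of the invariance on a toy two-layer linear path. This is only an illustration in NumPy, not the repository's code: the names `W1`, `W2`, `Q` and the sizes are hypothetical, and I use a random orthogonal matrix from a QR decomposition rather than the matrices the repository uses.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 8, 16  # toy input and intermediate sizes (assumed values)

# Toy two-layer linear path: y = (x @ W1) @ W2
x = rng.standard_normal((4, d))
W1 = rng.standard_normal((d, h))
W2 = rng.standard_normal((h, d))

# Random orthogonal matrix Q (Q @ Q.T == I) from a QR decomposition
Q, _ = np.linalg.qr(rng.standard_normal((h, h)))

# Fuse Q into the first weight and Q.T into the second
W1_rot = W1 @ Q
W2_rot = Q.T @ W2

y_ref = x @ W1 @ W2          # original output
y_rot = x @ W1_rot @ W2_rot  # output after rotation

# Largest element-wise difference; should be at floating-point noise level
max_err = np.abs(y_ref - y_rot).max()
print(max_err)
```

If the same property held for the full model, I would expect the rotated and unrotated models to give the same perplexity up to floating-point error.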
After removing the quantization part and keeping only the rotation function, I tested the model on the WikiText dataset and obtained significantly degraded performance. I also checked the model's generated output, following the method others used in other issues, like this:
The model began to generate nonsensical words. Others have reported the same behavior, as you can see in the issues section of your repository. Could you please explain this discrepancy?