Roman Makarov
Well, changing the version of transformers does not help. What do you mean by GPTQModel? Should I use their generation benchmark, or quantize the model with their method and then try it with AutoGPTQ generation...
I have the same issue with LLaMA and Phi models, even though I follow their instructions from [here](https://github.com/VainF/Torch-Pruning/tree/master/examples/LLMs). Am I the only one encountering this?