Juntongkuki
2 comments
> Update: removing `quantization="AWQ"` (per [this link](https://github.com/vllm-project/vllm/issues/6985)) seems to speed it up, but it is still slower than FP16. I have the same problem.
> [@it-dainb](https://github.com/it-dainb) Did you resolve this error? I have the same problem, and the loss is too large.
>
> INFO ------------------------------------------------------------
> INFO | process | layer | module | loss...