Juntongkuki

2 comments by Juntongkuki

> Update: removing `quantization="AWQ"` (per [this link](https://github.com/vllm-project/vllm/issues/6985)) seems to speed it up, but it is still slower than FP16. I have the same problem.
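For anyone who wants to reproduce the comparison, here is a minimal sketch of the two configurations, assuming the standard `vllm.LLM` constructor. The helper names `build_llm_kwargs` and `time_generation`, the prompt, and the token limit are illustrative and not taken from the thread. The quantization method is written in lowercase here (`"awq"`), as in the vLLM docs.

```python
import time


def build_llm_kwargs(model: str, force_awq: bool) -> dict:
    """Build keyword arguments for vllm.LLM (hypothetical helper).

    With force_awq=True the AWQ method is requested explicitly.
    With force_awq=False the key is left out, so vLLM reads the
    quantization method from the checkpoint's own config.
    """
    kwargs = {"model": model}
    if force_awq:
        kwargs["quantization"] = "awq"
    return kwargs


def time_generation(model: str, force_awq: bool, prompt: str = "Hello") -> float:
    """Load the model and return the seconds spent on one generate() call.

    Requires vLLM and a supported GPU. The import is done lazily so the
    helper above can be used without vLLM installed.
    """
    from vllm import LLM, SamplingParams

    llm = LLM(**build_llm_kwargs(model, force_awq))
    params = SamplingParams(max_tokens=128)  # arbitrary length for the timing run
    start = time.perf_counter()
    llm.generate([prompt], params)
    return time.perf_counter() - start
```

Running `time_generation` once with `force_awq=True` and once with `force_awq=False`, ideally in separate processes so GPU memory is released in between, should show whether the explicit setting is what slows things down on a given setup.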

> [@it-dainb](https://github.com/it-dainb) Did you resolve this error? I have the same problem, and the loss is too large:
>
>     INFO ------------------------------------------------------------
>     INFO | process | layer | module | loss...