AutoAWQ
Does AutoAWQ support multi-threaded quantization on CPU?
Greetings everyone.
- Server configuration: a modern multi-core CPU, plenty of system memory, and a relatively weak GPU with only 16 GB of VRAM.
- In this setup the GPU cannot be used to quantize the model, because it does not have enough VRAM.
- So is it possible to use the CPU's multiple cores instead?
- I tried setting device_map="cpu". The work does run on the CPU, but on ONLY ONE core, so it is really slow.
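
For context, here is a minimal sketch of the thread settings that might be worth checking in a situation like this. It is an assumption, not a confirmed fix: `configure_cpu_threads` is a hypothetical helper, and whether AutoAWQ's quantization loop actually benefits from these settings is not something established here. It only sets the standard OpenMP/MKL environment variables and, if PyTorch is installed, calls `torch.set_num_threads`.

```python
import os


def configure_cpu_threads(num_threads=None):
    """Hypothetical helper: request `num_threads` CPU threads.

    Sets OMP_NUM_THREADS and MKL_NUM_THREADS. These are read when the
    numerical libraries initialize, so call this before importing torch.
    """
    if num_threads is None:
        # os.cpu_count() can return None on some platforms; fall back to 1.
        num_threads = os.cpu_count() or 1
    for var in ("OMP_NUM_THREADS", "MKL_NUM_THREADS"):
        os.environ[var] = str(num_threads)
    return num_threads


n = configure_cpu_threads()

try:
    import torch

    # Intra-op parallelism: threads used inside a single operator (e.g. matmul).
    torch.set_num_threads(n)
    print("torch intra-op threads:", torch.get_num_threads())
except ImportError:
    print("torch not installed; only the environment variables were set")
```

If `torch.get_num_threads()` already reports more than one thread and only a single core is busy, the bottleneck is probably somewhere other than these settings.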
Can anyone offer an idea or hint for solving this? Thank you all :)