AutoAWQ
How to use multiple GPU nodes during quantization
When converting my Qwen2VL-72B model, I want to use multiple GPU nodes so that I can use more data. How can I achieve this?
Hi @ghntd, at the moment, data parallelism is not implemented. I welcome any help on implementing this that demonstrates a speedup.
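For anyone who wants to pick this up, here is a rough sketch of what data parallelism over the calibration data could look like. It is not AutoAWQ code and does not use any AutoAWQ API. It assumes, as a simplification, that each worker only needs to collect per-channel mean absolute activations on its own shard of the calibration samples and that the results are merged afterwards. The names `shard_samples`, `local_abs_stats`, and `merge_stats` are hypothetical, and the workers are simulated in a single process.

```python
from typing import List, Sequence, Tuple


def shard_samples(samples: Sequence, rank: int, world_size: int) -> list:
    """Give each worker every world_size-th sample, starting at its rank."""
    return list(samples[rank::world_size])


def local_abs_stats(
    activations: Sequence[Sequence[float]], n_channels: int
) -> Tuple[List[float], int]:
    """Per-channel sum of |x| and the sample count for one worker's shard."""
    sums = [0.0] * n_channels
    for row in activations:
        for c, value in enumerate(row):
            sums[c] += abs(value)
    return sums, len(activations)


def merge_stats(
    stats: Sequence[Tuple[List[float], int]], n_channels: int
) -> List[float]:
    """Combine per-worker sums and counts into a global per-channel mean.

    In a real multi-node setup this step would be a sum all-reduce;
    here it is a plain loop over the simulated workers (assumption).
    """
    total = [0.0] * n_channels
    count = 0
    for sums, n in stats:
        for c in range(n_channels):
            total[c] += sums[c]
        count += n
    if count == 0:
        raise ValueError("no calibration samples were provided")
    return [t / count for t in total]


# Toy calibration activations: 3 samples, 2 channels (made-up numbers).
activations = [[1.0, -2.0], [3.0, 4.0], [-5.0, 6.0]]
world_size = 2
n_channels = 2

per_worker = [
    local_abs_stats(shard_samples(activations, rank, world_size), n_channels)
    for rank in range(world_size)
]
merged = merge_stats(per_worker, n_channels)

# Reference: the same statistic computed by a single worker on all samples.
single = merge_stats([local_abs_stats(activations, n_channels)], n_channels)
```

Summing counts and sums before dividing keeps the merged result equal to the single-worker result even when shards have different sizes. Whether this gives a real speedup depends on how much of the quantization time is spent in the calibration forward passes, which this sketch does not measure.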
@casper-hansen any update on this?