AutoAWQ
Multi-Node Quantization using Ray?
Hi,
In theory, I could get enough compute to host and quantize current models.
However, it would be provided as multiple VMs, each with 2 GPUs, and each GPU with 48 GB of VRAM. With these, I could create a Ray cluster (https://github.com/ray-project/ray) with, e.g., 5 nodes, for a total of 10 GPUs and 480 GB of VRAM.
Is it possible to utilize this to quantize models with AutoAWQ?
Thank you very much! Best regards
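For reference, the capacity arithmetic behind the setup described above can be sketched in a few lines of plain Python. This is only a rough estimate under stated assumptions: `ClusterSpec`, `fp16_weight_gb`, and `smallest_fit` are hypothetical helper names (not part of AutoAWQ or Ray), the parameter counts are illustrative, and only FP16 weights (2 bytes per parameter) are counted, ignoring activations, calibration data, and framework overhead.

```python
from dataclasses import dataclass

BYTES_PER_GB = 1e9  # decimal gigabytes; good enough for a rough estimate


@dataclass(frozen=True)
class ClusterSpec:
    """Hypothetical description of a homogeneous GPU cluster."""
    nodes: int
    gpus_per_node: int
    vram_gb_per_gpu: float

    @property
    def total_gpus(self) -> int:
        return self.nodes * self.gpus_per_node

    @property
    def node_vram_gb(self) -> float:
        return self.gpus_per_node * self.vram_gb_per_gpu

    @property
    def total_vram_gb(self) -> float:
        return self.total_gpus * self.vram_gb_per_gpu


def fp16_weight_gb(num_params: float) -> float:
    """Approximate size of the FP16 weights alone (2 bytes per parameter)."""
    return num_params * 2 / BYTES_PER_GB


def smallest_fit(spec: ClusterSpec, num_params: float) -> str:
    """Smallest unit of the cluster whose VRAM can hold the FP16 weights."""
    need = fp16_weight_gb(num_params)
    if need <= spec.vram_gb_per_gpu:
        return "single GPU"
    if need <= spec.node_vram_gb:
        return "single node"
    if need <= spec.total_vram_gb:
        return "whole cluster"
    return "does not fit"


# The setup from the question: 5 nodes, 2 GPUs each, 48 GB per GPU.
cluster = ClusterSpec(nodes=5, gpus_per_node=2, vram_gb_per_gpu=48)

if __name__ == "__main__":
    print(cluster.total_gpus, "GPUs,", cluster.total_vram_gb, "GB VRAM in total")
    for params in (7e9, 34e9, 70e9):  # illustrative parameter counts
        print(f"{params / 1e9:.0f}B params:", smallest_fit(cluster, params))
```

Note that the "whole cluster" case only helps if the quantization tool can actually distribute the model across nodes; the sketch says nothing about whether that is possible. On a running Ray cluster, `ray.cluster_resources()` reports the resources Ray actually sees, which can be compared against these numbers.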
Hi @paolovic, at the moment this is not something explicitly supported or even something that I have attempted. I suspect it could be possible, but it's not something that I have researched. If you do find the time, I would love to receive a PR with either code changes / docs on how to do this.
Hi @casper-hansen, alright, as soon as I have time for that, I will dig into it. First, I assume I'll have to master Ray. Thank you for the quick response!