AWQ-Bert / 4-bit Bert

Open michaelfeil opened this issue 1 year ago • 2 comments

Hoping to add an implementation of 4-bit BERT, potentially in https://github.com/casper-hansen/AutoAWQ/pull/328. Contributions welcome.

michaelfeil • Feb 10 '24 01:02

Hi @michaelfeil, any chance you will look more closely into quantizing BERT models with AWQ? Your PR was off to a great start, but needs more experimentation to figure out how to scale a BERT model.

casper-hansen • Jun 21 '24 19:06

@casper-hansen I'm open to collaboration, but unfortunately there has been no further progress.

michaelfeil • Jun 24 '24 17:06