Implement 4bit, 8bit quantization for Nvidia GPUs

Open VictorOdede opened this issue 2 years ago • 6 comments

Can be done with GPTQ

VictorOdede avatar Sep 04 '23 15:09 VictorOdede

@VictorOdede What sort of time commitment do you think this is?

mmirman avatar Sep 11 '23 16:09 mmirman

If this isn't done with a library, it's a $200 ticket; if it is, it's a SWAG ticket.

mmirman avatar Sep 11 '23 16:09 mmirman

> If this isn't done with a library, it's a $200 ticket; if it is, it's a SWAG ticket.

This can be done using the bitsandbytes library.

VictorOdede avatar Sep 11 '23 16:09 VictorOdede

> @VictorOdede What sort of time commitment do you think this is?

A few hours max

VictorOdede avatar Sep 11 '23 18:09 VictorOdede

@VictorOdede Is this issue resolved yet?

bilal-aamer avatar Sep 20 '23 11:09 bilal-aamer

Hey @bilal-aamer. This has already been implemented with bitsandbytes/GPTQ. I'm just running some tests before merging the PR.

VictorOdede avatar Sep 20 '23 12:09 VictorOdede
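
For readers unfamiliar with what 4-bit and 8-bit quantization means, here is a minimal pure-Python sketch of symmetric absmax (round-to-nearest) quantization. It is an illustration only and not the code from the PR mentioned above: bitsandbytes and GPTQ use more elaborate schemes (for example block-wise scaling and error-compensated rounding) and run on the GPU. The names `quantize_absmax` and `dequantize` and the sample `weights` are hypothetical, chosen for this example.

```python
def quantize_absmax(values, bits):
    """Map floats to signed `bits`-bit integer codes using one shared scale."""
    qmax = 2 ** (bits - 1) - 1            # 127 for 8-bit, 7 for 4-bit
    absmax = max(abs(v) for v in values)
    scale = absmax / qmax if absmax else 1.0
    # Round to the nearest code and clamp to the representable range.
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate float values from integer codes."""
    return [c * scale for c in codes]


# Hypothetical weights, used only to show the precision trade-off.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]

for bits in (8, 4):
    codes, scale = quantize_absmax(weights, bits)
    restored = dequantize(codes, scale)
    max_err = max(abs(w - r) for w, r in zip(weights, restored))
    print(f"{bits}-bit codes={codes} max_error={max_err:.4f}")
```

Fewer bits mean a coarser grid of representable values, so the 4-bit round trip loses more precision than the 8-bit one in exchange for roughly half the storage.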