[WIP] Add support for quantization with bitsandbytes

Open mryab opened this issue 3 years ago • 2 comments

This PR integrates blockwise quantization from bitsandbytes as a new compression mechanism for Hivemind. The important part is that it is an optional compression protocol: users should only have to install the external library if they are going to use it, hence the "conditional import" and "extra dependency" parts.
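For readers unfamiliar with the "conditional import" pattern mentioned above, here is a minimal sketch of the idea. It is not the code from this PR: the `OptionalBackend` class, its `load` method, and the `install_hint` string are hypothetical names chosen for illustration. The only point is that the external library is imported on first use rather than at module import time, so users who never select this compression type never need it installed.

```python
import importlib
from types import ModuleType
from typing import Optional


class OptionalBackend:
    """Hypothetical helper: defers importing an optional dependency until first use."""

    def __init__(self, module_name: str, install_hint: str):
        # Nothing is imported here, so constructing the object is always safe,
        # even when the optional library is not installed.
        self._module_name = module_name
        self._install_hint = install_hint
        self._module: Optional[ModuleType] = None

    def load(self) -> ModuleType:
        """Import the backend on first call and cache it; fail with an actionable message."""
        if self._module is None:
            try:
                self._module = importlib.import_module(self._module_name)
            except ImportError as e:
                raise ImportError(
                    f"{self._module_name} is required for this compression type "
                    f"but is not installed. Try: {self._install_hint}"
                ) from e
        return self._module


# Assumed usage: the install hint below is illustrative, not a confirmed extra name.
blockwise_backend = OptionalBackend("bitsandbytes", install_hint="pip install <package>[<extra>]")
```

A codec built on this would call `blockwise_backend.load()` only inside its compress and extract paths, so importing the compression package itself stays free of the extra dependency.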

The code on the Hivemind side is pretty simple, but it'd be cool to have a way to include a CPU-only build of bitsandbytes as a dependency, so that we can both include it without checking for a CUDA version and test the integration in GHA. @TimDettmers has granted me access to the bitsandbytes repo, so I'm going to work on that first before marking this PR as ready to merge.

Jun 27 '22 16:06 mryab

How's the PR going? Need any help?

Jul 14 '22 15:07 justheuristic

Not sure if I need any help, since we're mostly waiting for the new bitsandbytes release.

Jul 14 '22 15:07 mryab

Codecov Report

Merging #490 (f311943) into master (6395e89) will decrease coverage by 0.04%. The diff coverage is 89.74%.

@@            Coverage Diff             @@
##           master     #490      +/-   ##
==========================================
- Coverage   86.31%   86.27%   -0.05%     
==========================================
  Files          81       81              
  Lines        7887     7919      +32     
==========================================
+ Hits         6808     6832      +24     
- Misses       1079     1087       +8

| Impacted Files | Coverage Δ | |
|---|---|---|
| hivemind/compression/quantization.py | `94.59% <87.50%> (-2.88%)` | :arrow_down: |
| hivemind/compression/__init__.py | `100.00% <100.00%> (ø)` | |
| hivemind/compression/serialization.py | `100.00% <100.00%> (ø)` | |
| hivemind/averaging/matchmaking.py | `88.35% <0.00%> (-0.90%)` | :arrow_down: |
| hivemind/averaging/averager.py | `88.27% <0.00%> (-0.24%)` | :arrow_down: |

Aug 22 '22 23:08 codecov[bot]

LGTM, please merge at will

Sep 09 '22 10:09 justheuristic
