EETQ
Easy and Efficient Quantization for Transformers
EETQ issues (4 results)
Hi, is there a way to run EETQ without an accelerator, at least for the quantization process? Thanks.
The training process is quite slow, whereas using 8-bit HQQ speeds it up by more than tenfold. Is this normal, or have I missed any code? ```python import torch from...
Using TGI or LoRAX, EETQ quantization takes several minutes (e.g., 10 minutes for Mixtral) every time the launcher is run. For reference, a bitsandbytes NF4 quant takes 1 minute....
Do you have plans to support the H100 (sm90)? https://github.com/NetEase-FuXi/EETQ/blob/1657b1504faa359e2ce0ac02999439d7ac8c74c0/csrc/cutlass_kernels/cutlass_preprocessors.cc#L113-L128