EETQ

Easy and Efficient Quantization for Transformers

Results 4 EETQ issues

Hi, is there a way to run EETQ without an accelerator, at least for the quantization process? Thanks.

The training process is quite slow, whereas using 8-bit hqq speeds it up by more than tenfold. Is this normal? Or have I missed any code? ```python import torch from...

With TGI or LoRAX, EETQ quantization takes several minutes (e.g., 10 minutes for Mixtral) every time the launcher is run. For reference, bitsandbytes NF4 quantization takes 1 minute....

Do you have plans to support the H100 (sm90)? https://github.com/NetEase-FuXi/EETQ/blob/1657b1504faa359e2ce0ac02999439d7ac8c74c0/csrc/cutlass_kernels/cutlass_preprocessors.cc#L113-L128