
8-bit quantization with PyTorch 2.0

Open eschaffn opened this issue 2 years ago • 3 comments

Hey there!

Is it possible to do post-training quantization with Parseq? I'm looking for ways to speed up inference time. I tried training a parseq-tiny model but lost about 13% absolute validation accuracy.

I'm new to quantization and am unsure which kinds of models benefit from it or which type of quantization to use.

Thanks for any suggestions!
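For what it's worth, PyTorch's post-training dynamic quantization can be tried on any module containing `nn.Linear` layers with a couple of lines. A minimal sketch, using a toy stand-in model rather than the actual parseq architecture:

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer block; any nn.Module with Linear layers
# is handled the same way by dynamic quantization.
model = nn.Sequential(
    nn.Linear(192, 384),
    nn.GELU(),
    nn.Linear(384, 192),
)
model.eval()

# Post-training dynamic quantization: Linear weights are stored as int8
# and dequantized on the fly; activations remain in float.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 192)
with torch.no_grad():
    y = quantized(x)
print(y.shape)  # torch.Size([1, 192])
```

Since no retraining is involved, this is the cheapest thing to try first; accuracy loss is usually much smaller than training a tinier model from scratch, though it mainly helps on CPU.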

eschaffn avatar May 17 '23 22:05 eschaffn

I'm sorry, but I'm also new to quantization and model deployment in general.

Another route is to use a bigger model, "sparsify" it, then prune the unused connections to speed up inference.
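The sparsify-then-prune idea above can be sketched with PyTorch's built-in pruning utilities (a toy layer here, not the actual parseq weights):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative only: one layer standing in for a larger model's Linear blocks.
layer = nn.Linear(16, 16)

# Sparsify: zero out the 50% smallest-magnitude weights (L1 unstructured pruning).
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Make the pruning permanent by folding the mask into the weight tensor.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.2f}")  # 0.50
```

One caveat: zeroed weights in a dense tensor don't speed up inference by themselves; you need structured pruning or a runtime with sparse kernels to actually realize the gains.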

baudm avatar Jun 09 '23 14:06 baudm

> Hey there!
>
> Is it possible to do post-training quantization with Parseq? I'm looking for ways to speed up inference time. I tried training a parseq-tiny model but lost about 13% absolute validation accuracy.
>
> I'm new to quantization and am unsure which kinds of models benefit from it or which type of quantization to use.
>
> Thanks for any suggestions!

Did you manage to speed up inference? And did you use post-training quantization?

dat080399 avatar Jul 14 '23 02:07 dat080399

I have added quantization support in a separate fork of this repo: https://github.com/VikasOjha666/parseq

By default, the model is trained with quantization-aware training (QAT), which helps preserve accuracy after quantization.
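For anyone unfamiliar with QAT, the general eager-mode recipe in PyTorch looks like the sketch below. This is a generic illustration, not code taken from the fork above; the tiny module is a hypothetical stand-in:

```python
import torch
import torch.nn as nn

# Minimal QAT sketch: fake-quantization observers are inserted during
# training so the model learns to tolerate int8 rounding error.
class TinyHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.fc = nn.Linear(32, 32)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.fc(self.quant(x))))

model = TinyHead()
model.train()
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
torch.ao.quantization.prepare_qat(model, inplace=True)

# ... run the normal training loop here; fake-quant is active throughout ...
_ = model(torch.randn(4, 32))

# After training, convert to a true int8 model for deployment.
model.eval()
quantized = torch.ao.quantization.convert(model)
out = quantized(torch.randn(2, 32))
```

Because the fake-quant noise is present during training, the converted int8 model typically loses much less accuracy than naive post-training quantization.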

VikasOjha666 avatar Nov 25 '23 20:11 VikasOjha666