
8-bit quantization with PyTorch 2.0

Open eschaffn opened this issue 2 years ago • 3 comments

Hey there!

Is it possible to do post-training quantization with Parseq? I'm looking for ways to speed up inference time. I tried training a parseq-tiny model but lost about 13% absolute validation accuracy.

I'm new to quantization and am unsure which kinds of models benefit from it or which type of quantization to use.

Thanks for any suggestions!
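For what it's worth, PyTorch's post-training dynamic quantization can be tried on any module containing `nn.Linear` layers with a couple of lines. A minimal sketch, using a toy stand-in model rather than the actual parseq architecture:

```python
import torch
import torch.nn as nn

# Toy stand-in for a transformer block; any nn.Module with Linear layers
# is handled the same way by dynamic quantization.
model = nn.Sequential(
    nn.Linear(192, 384),
    nn.GELU(),
    nn.Linear(384, 192),
)
model.eval()

# Post-training dynamic quantization: Linear weights are stored as int8
# and dequantized on the fly; activations remain in float.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

x = torch.randn(1, 192)
with torch.no_grad():
    y = quantized(x)
print(y.shape)  # torch.Size([1, 192])
```

Since no retraining is involved, this is the cheapest thing to try first; accuracy loss is usually much smaller than training a tinier model from scratch, though it mainly helps on CPU.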

eschaffn avatar May 17 '23 22:05 eschaffn

I'm sorry, but I'm also new to quantization and model deployment in general.

Another route is to use a bigger model, "sparsify" it, then prune the unused connections to speed up inference.
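The sparsify-then-prune idea above can be sketched with PyTorch's built-in pruning utilities (a toy layer here, not the actual parseq weights):

```python
import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

# Illustrative only: one layer standing in for a larger model's Linear blocks.
layer = nn.Linear(16, 16)

# Sparsify: zero out the 50% smallest-magnitude weights (L1 unstructured pruning).
prune.l1_unstructured(layer, name="weight", amount=0.5)

# Make the pruning permanent by folding the mask into the weight tensor.
prune.remove(layer, "weight")

sparsity = (layer.weight == 0).float().mean().item()
print(f"weight sparsity: {sparsity:.2f}")  # 0.50
```

One caveat: zeroed weights in a dense tensor don't speed up inference by themselves; you need structured pruning or a runtime with sparse kernels to actually realize the gains.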

baudm avatar Jun 09 '23 14:06 baudm

> Hey there!
>
> Is it possible to do post-training quantization with Parseq? I'm looking for ways to speed up inference time. I tried training a parseq-tiny model but lost about 13% absolute validation accuracy.
>
> I'm new to quantization and am unsure which kinds of models benefit from it or which type of quantization to use.
>
> Thanks for any suggestions!

Did you manage to speed up inference? And did you use post-training quantization?

dat080399 avatar Jul 14 '23 02:07 dat080399

I have added quantization support in a separate fork of this repo: https://github.com/VikasOjha666/parseq

By default, the model is trained with quantization-aware training (QAT), which helps preserve accuracy after quantization.
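For anyone unfamiliar with QAT, the general eager-mode recipe in PyTorch looks like the sketch below. This is a generic illustration, not code taken from the fork above; the tiny module is a hypothetical stand-in:

```python
import torch
import torch.nn as nn

# Minimal QAT sketch: fake-quantization observers are inserted during
# training so the model learns to tolerate int8 rounding error.
class TinyHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.quant = torch.ao.quantization.QuantStub()
        self.fc = nn.Linear(32, 32)
        self.relu = nn.ReLU()
        self.dequant = torch.ao.quantization.DeQuantStub()

    def forward(self, x):
        return self.dequant(self.relu(self.fc(self.quant(x))))

model = TinyHead()
model.train()
model.qconfig = torch.ao.quantization.get_default_qat_qconfig("fbgemm")
torch.ao.quantization.prepare_qat(model, inplace=True)

# ... run the normal training loop here; fake-quant is active throughout ...
_ = model(torch.randn(4, 32))

# After training, convert to a true int8 model for deployment.
model.eval()
quantized = torch.ao.quantization.convert(model)
out = quantized(torch.randn(2, 32))
```

Because the fake-quant noise is present during training, the converted int8 model typically loses much less accuracy than naive post-training quantization.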

VikasOjha666 avatar Nov 25 '23 20:11 VikasOjha666