Chenhao Xue comments

Results: 8 comments of Chenhao Xue

How to save quantized model?

We're still working on it, and it will be released along with the quantized models' pth files later. If you only need the quantized weights and biases, there's a quick way to get started...

SwinTransformer3D

Currently that's not available, sorry about that. Some slight modifications could get things working if: 1. You have a calibration dataset for the model and you know how to preprocess the...

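This work mainly targets Vision Transformers, so we did not implement quantization and wrapping for the conv3D operator. The embedding operation in ViT is implemented with conv2D, and that code is in `layers.conv`. You could implement conv3D by following the conv2D implementation; in terms of core logic there is no real difference between the two, and you only need to add the extra dimension to the relevant operations. I wrote out in the comments, line by line, how the dimensions change at each step of conv2d, which can serve as a reference. Once you have implemented the conv3D operator, you also need to add the wrapping for conv3D in `utils.net_wrap`.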

How to load the quantized models with PTQ4ViT into the net?

The quantized models only contain NN parameters in int8, which cannot be loaded and put into use directly. I would suggest you follow the given example files and quantize a...

How to load the quantized models with PTQ4ViT into the net?

I only noticed these issues recently. I'll start working on a more user-friendly interface ASAP, so that you don't have to quantize a model from scratch.

Shape of saved quantized model parameter

We import our ViT model from package `timm`, and this is how they store their weight tensor. Indeed, W_Q, W_K and W_V should be [2304, 768], but `timm` fuses the...
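As an illustration of the fused layout, here is a minimal sketch that slices a [2304, 768] weight back into three [768, 768] blocks. It assumes the Q, K and V blocks are stacked in that order along the first dimension; `split_fused_qkv` is a hypothetical helper, not part of PTQ4ViT or `timm`, and a nested list stands in for the real tensor.

```python
def split_fused_qkv(weight, dim=768):
    """Split a fused QKV weight of shape [3 * dim, dim] into Q, K, V blocks.

    Hypothetical helper: assumes the rows are ordered Q, then K, then V.
    """
    if len(weight) != 3 * dim:
        raise ValueError(f"expected {3 * dim} rows, got {len(weight)}")
    w_q = weight[0:dim]              # rows 0 .. dim-1
    w_k = weight[dim:2 * dim]        # rows dim .. 2*dim-1
    w_v = weight[2 * dim:3 * dim]    # rows 2*dim .. 3*dim-1
    return w_q, w_k, w_v


# Stand-in for a fused weight: 2304 rows of 768 values, row i filled with i.
fused = [[float(i)] * 768 for i in range(2304)]
w_q, w_k, w_v = split_fused_qkv(fused)
print(len(w_q), len(w_q[0]))  # number of rows and columns of the Q block
```

The helper only uses `len` and row slicing, so the same slicing should carry over to a tensor whose first dimension is the fused output dimension.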

Constrain the scaling factors of the two ranges

PTQ4ViT adopts two classes `SoSMatMul` and `PostGeluLinear` in `quant_layers/`. `SoSMatMul` adopts a variable `split` to quantize post-softmax values. It indicates the split point of two ranges [0, split] and [split,...
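To make the role of a split point concrete, the following is a simplified scalar sketch of a two-range uniform quantizer for values in [0, 1]. It is not the code of `SoSMatMul`: the function name, the choice to reserve one bit for selecting the range, and the step sizes (`split / 2^(bits-1)` below the split, `1 / 2^(bits-1)` above it) are assumptions made for illustration.

```python
def twin_uniform_quantize(x, split, bits=8):
    """Quantize x in [0, 1] with a finer step below `split` (illustrative only)."""
    levels = 2 ** (bits - 1)      # assumption: one bit selects the range
    if x < split:
        step = split / levels     # small range [0, split): fine step
    else:
        step = 1.0 / levels       # large range [split, 1]: coarse step
    q = min(max(round(x / step), 0), levels - 1)  # integer code, clamped
    return q * step               # de-quantized value


# A small post-softmax value keeps more precision than with a single
# uniform step of 1 / 128 over the whole [0, 1] range.
small = 0.001
print(twin_uniform_quantize(small, split=0.0625))
print(round(small * 128) / 128)   # single-range uniform quantization
```

With a single range, most small post-softmax values would collapse to zero; the finer step below the split point is what preserves them.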

Constrain the scaling factors of the two ranges

`A_interval_candidates` should be initialized with 2 ** m * the initial `a_neg_interval`. It looks like a mistake popped up when merging the code for the released version; the original code for the experiments is kind of...
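A minimal sketch of that initialization, assuming m runs over the non-negative integers 0, 1, 2, ... up to some count. The function name, the range of m and the example interval value are hypothetical; the repository may build the candidates differently.

```python
def interval_candidates(initial_interval, num_candidates):
    """Return [2**0 * s, 2**1 * s, ...] for an initial interval s.

    Hypothetical helper: the range of m is an assumption.
    """
    return [2 ** m * initial_interval for m in range(num_candidates)]


# Example with an arbitrary initial interval of 0.125 and four values of m.
candidates = interval_candidates(0.125, 4)
print(candidates)
```

Every candidate is a power-of-two multiple of the initial interval, which is the constraint on the two scaling factors discussed in this thread.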