
update post-quantization feature with various model supports

Open tgisaturday opened this issue 5 years ago • 6 comments

Prototype update for post-quantization support with torch.quantization. Tested with torch 1.6.

Supported models:

  • efficientnet family
  • mobilenetv3
  • rexnet

Quantizable models and the modifications made to support quantization are located in timm/models/quantization. To run post-quantization, use post_quantization_validation.py. In addition to the basic validation.py configs, you need to specify data_path for calibration (using the training dataset is recommended) and the number of calibration iterations (default: 100).
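For context, the general eager-mode post-training static quantization flow in torch.quantization (the API the PR is built on, available since torch 1.6) looks roughly like the sketch below. The TinyNet model and the calibration loop are illustrative placeholders, not the PR's actual code; the 'fbgemm' backend assumes an x86 host.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Illustrative stand-in for a quantizable model."""
    def __init__(self):
        super().__init__()
        self.quant = torch.quantization.QuantStub()      # fp32 -> int8 entry point
        self.fc = nn.Linear(8, 4)
        self.relu = nn.ReLU()
        self.dequant = torch.quantization.DeQuantStub()  # int8 -> fp32 exit point

    def forward(self, x):
        x = self.quant(x)
        x = self.relu(self.fc(x))
        return self.dequant(x)

model = TinyNet().eval()
model.qconfig = torch.quantization.get_default_qconfig('fbgemm')
prepared = torch.quantization.prepare(model)

# Calibration: run representative data (e.g. training samples) through the
# prepared model so the inserted observers record activation ranges.
for _ in range(100):
    prepared(torch.randn(2, 8))

# Convert observed modules to their int8 counterparts.
quantized = torch.quantization.convert(prepared)
out = quantized(torch.randn(2, 8))
```

After convert, the Linear layer stores int8 weights, which is where the latency and file-size reductions discussed below come from.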

Testing with my custom models showed an average 1-2% accuracy drop and more than a 50% reduction in latency and weight file size. As reported in #204, using pretrained weights for the supported models shows an average 5% performance drop (efficientnet_lite0: 68.364); the modifications made for quantization support may be the cause. The latency (60%) and file size (75%) reduction rates seem reasonable for all models.

tgisaturday avatar Sep 21 '20 11:09 tgisaturday

@tgisaturday thanks for the PR, I'm going to leave it here for now (unmerged). There is too much duplication of loader, model, and layer code for me to merge this into the mainline; I do not want to end up maintaining 2x the code. But this may be a useful reference for those wanting to use quantization with these models.

I will consider merging a quantization PR if it can be done in a manner that does not significantly alter existing models, does not duplicate the code of existing models, and minimally duplicates layers. I realize that is challenging the way native quantization is currently set up in PyTorch, and even torchvision duplicates a lot of code for quantized models. There's got to be a better way of modifying the models with scripts, leveraging the IR of TorchScript, or using TVM, etc., to do this without maintaining duplicate model definitions...
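The "modify via IR" idea can be sketched with torch.fx (the graph IR that later became the basis for FX quantization): trace a model once, then rewrite its graph-referenced submodules by name, without editing or duplicating the original class. The Net model and the ReLU-to-ReLU6 swap below are purely illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.fx

class Net(nn.Module):
    """Illustrative model whose definition we never touch."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.fc(x))

# Trace to a GraphModule, which owns copies of the submodules.
traced = torch.fx.symbolic_trace(Net())

# Walk the graph and swap every ReLU submodule for ReLU6, by target name,
# as a stand-in for the kind of quantization-oriented rewrites discussed above.
modules = dict(traced.named_modules())
for node in traced.graph.nodes:
    if node.op == 'call_module' and isinstance(modules[node.target], nn.ReLU):
        setattr(traced, node.target, nn.ReLU6())
traced.recompile()

out = traced(torch.randn(2, 4))
```

The same traversal pattern is how FX-based quantization inserts observers and swaps modules, which is why it avoids maintaining a second copy of each model definition.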

rwightman avatar Sep 23 '20 17:09 rwightman

@rwightman Yes, I'm also trying to come up with a practical mechanism for converting PyTorch models into quantizable ones, but I'm still struggling... the native PyTorch conversion tools are not working properly, and the best practice so far is rewriting the model from scratch.

tgisaturday avatar Sep 24 '20 04:09 tgisaturday

@tgisaturday or anyone who reads this...

With the beta FX feature in PyTorch 1.8, a better approach to post-quantization appears to be on the horizon. I'd be open to a PR that can post-quantize any model here using transforms, without requiring any modifications to the original model definitions (if that's possible).

rwightman avatar Mar 05 '21 18:03 rwightman

@rwightman Cool! I'll take a closer look into it and make another PR.

tgisaturday avatar Mar 21 '21 10:03 tgisaturday

@rwightman I'm currently working on dynamic (weight-only) and static quantization with the FX feature. I haven't tried actual quantization yet. I'm also planning to add a quantization-aware training feature, including from-scratch quantization-aware training with StatAssist & GradBoost as suggested in https://github.com/clovaai/frostnet. I'll keep you updated on the progress.
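For reference, dynamic (weight-only) quantization is already available out of the box via torch.quantization.quantize_dynamic: weights are stored in int8 while activations stay float and are quantized on the fly per batch. The toy Sequential model below is an assumption for illustration.

```python
import torch
import torch.nn as nn

# Illustrative float model; only its Linear layers will be quantized.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4)).eval()

# Weight-only quantization: no calibration data is needed, since activation
# scales are computed dynamically at inference time.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

out = quantized(torch.randn(2, 16))
```

Because no calibration pass is required, this is the lowest-effort starting point before attempting static FX quantization.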

tgisaturday avatar Mar 22 '21 07:03 tgisaturday

Could you please give an example showing how to run your post-quantization code? I have tried your commit as follows:

from timm.models.quantization import quant_rexnet_100
model = quant_rexnet_100(pretrained=True)

but there are no pretrained weights available:

RuntimeError: Error(s) in loading state_dict for ReXNetV1:                                                                                                          
        Missing key(s) in state_dict: "stem.bn1.weight", ...
        Unexpected key(s) in state_dict: "stem.bn.weight", ...

Even when I load the model with pretrained=False, I also encounter errors when doing quantization (with PyTorch 1.8.1):

import copy
import torch
from torch.quantization import quantize_fx
from timm.models.quantization import quant_rexnet_100

model = quant_rexnet_100(pretrained=False)
model_to_quantize = copy.deepcopy(model)
model_to_quantize.eval()
qconfig_dict = {"": torch.quantization.get_default_qconfig('qnnpack')}
model_prepared = quantize_fx.prepare_fx(model_to_quantize, qconfig_dict)
model_quantized = quantize_fx.convert_fx(model_prepared)

The error:

  File "/home/xxx/.local/lib/python3.6/site-packages/torch/fx/symbolic_trace.py", line 191, in path_of_module
    raise NameError('module is not installed as a submodule')
NameError: module is not installed as a submodule
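A hedged guess at the failure mode: prepare_fx symbolically traces the model first, and "module is not installed as a submodule" typically surfaces when tracing hits control flow or module access patterns torch.fx cannot handle. Tracing the model directly with torch.fx.symbolic_trace isolates the problem before any quantization is attempted; the Traceable model below is an illustrative stand-in.

```python
import torch
import torch.nn as nn
import torch.fx

class Traceable(nn.Module):
    """Illustrative model with no data-dependent control flow."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 4)

    def forward(self, x):
        return torch.relu(self.fc(x))

# If this raises, the model needs rework (or wrapping of the offending
# submodules) before FX quantization can prepare it.
graph_module = torch.fx.symbolic_trace(Traceable())
out = graph_module(torch.randn(2, 4))
```

Running this check against quant_rexnet_100 itself would show whether the error comes from the model definition or from the quantization config.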

So I have some major questions:

  1. Is this PR able to do post-quantization on the existing timm float pretrained models, or must the float models first be trained with your PR before the post-quantization process?
  2. Does line307 mean that only the weights are quantized and the activations still remain float?
  3. How do we load model_fp?

@tgisaturday

kebijuelun avatar Aug 03 '21 06:08 kebijuelun