Alexander comments

Results: 23 comments by Alexander

✨[Feature] support other methods compile besides 'forward' method

> 1. We might need to look at FreezeModule preserved attributes

Yes, I tried using FreezeModule's preserved attributes, but it failed.

AttributeError: 'RepVGGBlock' object has no attribute 'conv'

@wanyne-yyds Which model are you using: v6s_reopt or v6s?

AttributeError: 'RepVGGBlock' object has no attribute 'conv'

@wanyne-yyds The op names in op_concat_fusion_list come from v6s_reopt. If you apply partial PTQ to v6s instead, you need to change the op names in op_concat_fusion_list to match that model.

Question about a QAT-trained model

@dejavvuu
1. You can remove the quant/de-quant ops from the ONNX graph before deployment.
2. A drop of about 0.1 mAP is normal.
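To illustrate what removing quant/de-quant ops involves, here is a minimal sketch on a toy graph. It does not use the `onnx` package: nodes are plain dicts, and `strip_qdq` is a hypothetical helper. The only things taken from ONNX are the operator names `QuantizeLinear` and `DequantizeLinear` and the fact that their first input is the data tensor. The idea is to drop each Q/DQ node and reconnect its consumers to the tensor that fed it.

```python
QDQ_OPS = {"QuantizeLinear", "DequantizeLinear"}


def strip_qdq(nodes):
    """Remove Q/DQ nodes from a toy graph and rewire their consumers.

    `nodes` is a list of dicts with keys 'op', 'inputs', 'outputs'
    (an illustrative stand-in for a real ONNX graph).
    """
    # Each Q/DQ output becomes an alias of its data input (the first input).
    alias = {
        n["outputs"][0]: n["inputs"][0] for n in nodes if n["op"] in QDQ_OPS
    }

    def resolve(name):
        # Follow chains such as x -> x_q -> x_dq back to the original tensor.
        while name in alias:
            name = alias[name]
        return name

    kept = []
    for n in nodes:
        if n["op"] in QDQ_OPS:
            continue
        kept.append({**n, "inputs": [resolve(i) for i in n["inputs"]]})
    return kept


toy_graph = [
    {"op": "QuantizeLinear", "inputs": ["x", "s", "zp"], "outputs": ["x_q"]},
    {"op": "DequantizeLinear", "inputs": ["x_q", "s", "zp"], "outputs": ["x_dq"]},
    {"op": "Conv", "inputs": ["x_dq", "w"], "outputs": ["y"]},
]
stripped = strip_qdq(toy_graph)
```

On a real model you would also need to handle graph outputs produced directly by a Q/DQ node and clean up the scale and zero-point tensors that are no longer used.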

Calibration:

HF_MODEL=./llama-2-7b
WORK_DIR=../llama-2-7b-awq
python3 -m lmdeploy.lite.apis.calibrate \
  --model $HF_MODEL \
  --calib_dataset 'c4' \
  --calib_samples 128 \
  --calib_seqlen 2048 \
  --work_dir $WORK_DIR

Quantization:

HF_MODEL=./llama-2-7b
WORK_DIR=../llama-2-7b-awq-64
python3 -m lmdeploy.lite.apis.auto_awq \
  --model $HF_MODEL...

[bug] group_size = 64 has bug

@irexyc No, I ran through the conversion and deployment process myself using the official model.

What is the difference between `get_act_scales` and `get_static_decoder_layer_scales`

@CaffreyR `get_act_scales` collects the scales used to smooth the activations and weights; `get_static_decoder_layer_scales` collects the scales used to quantize the activations and weights.
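The two roles can be sketched in plain Python. This is an illustration, not the repository's code: `smoothing_scales` and `static_int8_scale` are hypothetical names, the smoothing formula is the one from the SmoothQuant paper (s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)), and the quantization scale assumes symmetric per-tensor int8.

```python
def smoothing_scales(act_absmax, weight_absmax, alpha=0.5, eps=1e-5):
    """Per-input-channel smoothing factors s_j = a_j**alpha / w_j**(1 - alpha)."""
    return [
        max(a, eps) ** alpha / max(w, eps) ** (1.0 - alpha)
        for a, w in zip(act_absmax, weight_absmax)
    ]


def static_int8_scale(absmax, n_bits=8):
    """Symmetric scale mapping [-absmax, absmax] onto the signed integer range."""
    qmax = 2 ** (n_bits - 1) - 1  # 127 for int8
    return absmax / qmax


# Per-channel absolute maxima, as a calibration pass might record them.
act_absmax = [16.0, 1.0, 4.0]
weight_absmax = [1.0, 1.0, 4.0]

# Step 1 (smoothing): shift outlier magnitude from activations into weights.
scales = smoothing_scales(act_absmax, weight_absmax)
smoothed_act = [a / s for a, s in zip(act_absmax, scales)]
smoothed_weight = [w * s for w, s in zip(weight_absmax, scales)]

# Step 2 (quantization): a static scale computed on the smoothed activations.
act_scale = static_int8_scale(max(smoothed_act))
```

In this toy example the first channel's activation maximum drops from 16 to 4 while its weight maximum rises from 1 to 4, so both tensors end up with a similar range before the quantization scale is computed.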

where is the file 'loose_bb_train.csv'

Please download it from the VGGFace2 website: http://www.robots.ox.ac.uk/~vgg/data/vgg_face2/

About "the same shape, except at concat_axis"?

@yang0817manman Make sure the input image is 128x128 (width x height).
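The rule behind that error message can be sketched as a small check: every input to a concat must have the same shape on all axes except the concat axis. `check_concat_shapes` is a hypothetical helper written for illustration, and the shapes below are made up; an unexpected input size can leave two branches with different spatial dimensions, which is what trips this check.

```python
def check_concat_shapes(shapes, concat_axis=1):
    """Return the concatenated shape, or raise if the inputs are incompatible."""
    ref = shapes[0]
    for idx, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(ref):
            raise ValueError(f"input {idx} has a different number of axes")
        for axis, (a, b) in enumerate(zip(ref, shape)):
            # Only the concat axis is allowed to differ between inputs.
            if axis != concat_axis and a != b:
                raise ValueError(
                    f"input {idx} differs on axis {axis}: {b} vs {a}"
                )
    out = list(ref)
    out[concat_axis] = sum(s[concat_axis] for s in shapes)
    return tuple(out)


# Two feature maps in (N, C, H, W) layout that agree everywhere but channels.
ok_shape = check_concat_shapes([(1, 64, 16, 16), (1, 32, 16, 16)])

# A mismatch in height/width is rejected.
try:
    check_concat_shapes([(1, 64, 16, 16), (1, 32, 15, 15)])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```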

How's the inference speed?

vgg16+s3fd is hard to run in real time on a mobile device. If you want a real-time model, you can use mv2+s3fd: https://github.com/lippman1125/S3FD.PyTorch