Transformer dygraph PTQ quantization error
Paddle version: 2.2.0, PaddleSlim version: 2.2.0
Code:
import paddle
from paddleslim import PTQ
from paddlenlp.transformers import InferTransformerModel

transformer = InferTransformerModel(
    src_vocab_size=20000,
    trg_vocab_size=20000,
    max_length=128,
    num_encoder_layers=6,
    num_decoder_layers=2,
    n_head=8,
    d_model=256,
    d_inner_hid=2048,
    dropout=0.1,
    weight_sharing=False,
    bos_id=0,
    eos_id=1,
    beam_size=2,
    max_out_len=128)
transformer.eval()

ptq = PTQ()
quant_model = ptq.quantize(transformer, fuse=True, fuse_list=None)
ptq.save_quantized_model(
    quant_model, 'models',
    input_spec=[paddle.static.InputSpec(shape=[1, 128], dtype='int64')])
Error:
Traceback (most recent call last):
File "ptq.py", line 153, in <module>
main(args)
File "ptq.py", line 142, in main
dtype='int64')
File ".../.conda/envs/paddle2/lib/python3.6/site-packages/paddleslim/dygraph/quant/ptq.py", line 146, in save_quantized_model
model=model, path=path, input_spec=input_spec)
File ".../.conda/envs/paddle2/lib/python3.6/site-packages/paddle/fluid/contrib/slim/quantization/imperative/ptq.py", line 139, in save_quantized_model
self._convert(model)
File ".../.conda/envs/paddle2/lib/python3.6/site-packages/paddle/fluid/contrib/slim/quantization/imperative/ptq.py", line 203, in _convert
self._save_output_thresholds(sub_layer, sub_layer._quant_config)
File ".../.conda/envs/paddle2/lib/python3.6/site-packages/paddle/fluid/contrib/slim/quantization/imperative/ptq.py", line 261, in _save_output_thresholds
assert len(output_thresholds) == 1
AssertionError
Hello, we have reproduced this issue and are investigating the cause.
Hello,
quant_model = ptq.quantize(transformer, fuse=True, fuse_list=None)
ptq.save_quantized_model(
    quant_model, 'models',
    input_spec=[paddle.static.InputSpec(shape=[1, 128], dtype='int64')])
A calibration step needs to be inserted between these two calls (a sketch follows below): https://github.com/PaddlePaddle/PaddleSlim/blob/develop/demo/dygraph/post_quant/ptq.py#L129
Calibration simply runs some test data through the model to collect statistics and compute the activation thresholds. Without it, the thresholds are empty, which triggers the assertion error you see in the log (assert len(output_thresholds) == 1).
Related documentation: https://github.com/PaddlePaddle/PaddleSlim/tree/develop/demo/dygraph/post_quant#校准模型
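As a minimal sketch of that calibration step (here `calib_loader` is a hypothetical DataLoader yielding source-token batches, and the batch count is arbitrary; the linked demo wraps the same idea in a helper function):

import paddle

# Run a few batches of representative data through the quantized model
# so PTQ can record activation statistics (the output thresholds).
quant_model.eval()
with paddle.no_grad():
    for step, src_word in enumerate(calib_loader):  # hypothetical loader
        quant_model(src_word)   # forward pass only; outputs are discarded
        if step + 1 >= 10:      # a handful of batches is usually enough
            break

After this loop, call ptq.save_quantized_model(...) as before.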
OK, I'll give it a try.
After adding the calibration pass over data as described above, the model can be saved. However, converting it with paddle_lite_opt raises the error below. Is this a configuration problem or an unsupported-operator problem?
paddle_lite_opt --model_file=infer_small_model.pdmodel --param_file=infer_small_model.pdiparams --optimize_out_type=naive_buffer --optimize_out=test --valid_targets=arm --quant_model=true --quant_type=QUANT_INT8
Error message:
[F 12/ 9 7:22:44.246 .../Paddle-Lite-develop/lite/core/op_lite.h:73 InferType] Error! fill_zeros_like::InferType() function must be registered for op fill_zeros_like
Aborted (core dumped)
That is an unsupported operator. Can the model be deployed on Paddle-Lite normally without quantization?
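(As a side note: assuming a recent Paddle-Lite build, the opt tool can print the operators a model uses and whether the chosen targets support them, which helps pin down the blocking op. The flag below is from the Paddle-Lite opt documentation and may vary by version:)

paddle_lite_opt --print_model_ops=true --model_file=infer_small_model.pdmodel --param_file=infer_small_model.pdiparams --valid_targets=arm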
Without quantization, inference works, but the model is too large. I need compression to cut its memory footprint, and ideally to speed up inference at the same time.