crnn.pytorch.tensorrt.chinese: How to convert to FP16 or INT8

If I want to convert the model to the FP16 or INT8 types supported by TensorRT, what should I change?

Apr 14 '21 05:04 zhong-xin

If I want to convert the model to the FP16 or INT8 types supported by TensorRT, what should I change?

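FP16 conversion is currently the default; see #define USE_FP16 in crnn_trt/crnn_number.cpp (if you don't want FP16, just remove that line). INT8 conversion is not supported yet; give me some time to work on it.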

Apr 14 '21 06:04 ygfrancois
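
For readers following along, here is a minimal sketch of how a USE_FP16 macro like the one mentioned above typically gates FP16 in TensorRT-based C++ code. It illustrates the general pattern and is not a copy of crnn_trt/crnn_number.cpp. FakeBuilderConfig, applyPrecision, and precisionName are hypothetical names used so that the snippet compiles without TensorRT installed; in real code the config object would be nvinfer1::IBuilderConfig and the flag nvinfer1::BuilderFlag::kFP16.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

#define USE_FP16  // remove or comment out this line to build an FP32 engine

// Hypothetical stand-in for TensorRT's builder flag enum, so the sketch
// compiles without the TensorRT headers (assumption, not the real type).
enum class BuilderFlag : std::uint32_t { kFP16 = 0 };

// Hypothetical stand-in for nvinfer1::IBuilderConfig: stores flags as a bitmask.
struct FakeBuilderConfig {
    std::uint32_t flags = 0;

    void setFlag(BuilderFlag f) {
        flags |= 1u << static_cast<std::uint32_t>(f);
    }
    bool getFlag(BuilderFlag f) const {
        return ((flags >> static_cast<std::uint32_t>(f)) & 1u) != 0;
    }
};

// Applies the compile-time precision choice to the builder config.
inline void applyPrecision(FakeBuilderConfig& config) {
#ifdef USE_FP16
    config.setFlag(BuilderFlag::kFP16);
    assert(config.getFlag(BuilderFlag::kFP16));
#else
    (void)config;  // FP32 is the default, so there is nothing to set
#endif
}

// Returns a readable name for the precision the config will build with.
inline std::string precisionName(const FakeBuilderConfig& config) {
    return config.getFlag(BuilderFlag::kFP16) ? "FP16" : "FP32";
}
```

With the #define in place, calling applyPrecision on a fresh config switches it from FP32 to FP16; deleting the #define leaves the config untouched, which matches the "just remove that line" advice.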

OK, understood. Thanks for sharing the code; it has been very helpful.

Apr 14 '21 06:04 zhong-xin

OK, understood. Thanks for sharing the code; it has been very helpful.

Thanks for your interest. I'll let you know once INT8 is working.

Apr 14 '21 09:04 ygfrancois

Does it also work for Traditional Chinese?

Apr 14 '21 11:04 vllsm

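@ygfrancois With a 32x100 input image, inference takes 14.46 ms when run directly in PyTorch, 6.69 ms after conversion to a TensorRT FP32 engine, and 6.32 ms after conversion to FP16. Why is the difference between FP32 and FP16 so small? Is this normal?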

Apr 19 '21 07:04 zhong-xin

Does it also work for Traditional Chinese?

The network itself does, of course, but the pretrained weights do not support Traditional Chinese; you would need to train on a Traditional Chinese dataset yourself.

Apr 20 '21 06:04 ygfrancois

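@ygfrancois With a 32x100 input image, inference takes 14.46 ms when run directly in PyTorch, 6.69 ms after conversion to a TensorRT FP32 engine, and 6.32 ms after conversion to FP16. Why is the difference between FP32 and FP16 so small? Is this normal?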

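Are you using a 2080 Ti? Is the reduction in GPU memory noticeable? My guess is that it comes down to the hardware or to cuDNN's internal implementation.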

Apr 20 '21 06:04 ygfrancois

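@ygfrancois With a 32x100 input image, inference takes 14.46 ms when run directly in PyTorch, 6.69 ms after conversion to a TensorRT FP32 engine, and 6.32 ms after conversion to FP16. Why is the difference between FP32 and FP16 so small? Is this normal?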

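Are you using a 2080 Ti? Is the reduction in GPU memory noticeable? My guess is that it comes down to the hardware or to cuDNN's internal implementation.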

I'm using a TX2. The PyTorch model takes 854 MB of GPU memory; after FP32 conversion it takes 1 GB, and after FP16 conversion 726 MB.

Apr 20 '21 06:04 zhong-xin
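
To put the reported latencies in perspective, the arithmetic can be sketched as below. The three constants are the numbers quoted in this thread; the helper names speedup and reductionPercent are made up for illustration. The sketch only restates the measurements as ratios and does not explain why FP16 gains so little on this device.

```cpp
#include <cassert>
#include <cmath>

// Latencies reported in the thread (32x100 input, Jetson TX2), in milliseconds.
constexpr double kPyTorchMs = 14.46;
constexpr double kTrtFp32Ms = 6.69;
constexpr double kTrtFp16Ms = 6.32;

// Speedup factor of the candidate over the baseline (both in milliseconds).
inline double speedup(double baselineMs, double candidateMs) {
    assert(baselineMs > 0.0 && candidateMs > 0.0);
    return baselineMs / candidateMs;
}

// Relative latency reduction, in percent, when moving from baseline to candidate.
inline double reductionPercent(double baselineMs, double candidateMs) {
    assert(baselineMs > 0.0);
    return 100.0 * (baselineMs - candidateMs) / baselineMs;
}
```

Plugging in the numbers, speedup(kPyTorchMs, kTrtFp32Ms) is about 2.16 and speedup(kPyTorchMs, kTrtFp16Ms) about 2.29, while reductionPercent(kTrtFp32Ms, kTrtFp16Ms) is about 5.5. In other words, nearly all of the measured gain comes from moving to TensorRT at all, and FP16 adds only a few percent on top.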
