Optimize performance of depthwise_conv_bwd

Open ZzSean opened this issue 3 years ago • 1 comment

PR types

Performance optimization

PR changes

OPs

Describe

Optimize the performance of depthwise_conv_bwd for the input gradient

  • Method:
  1. Reduce modulo operations and other redundant calculations
  2. Adjust the block/grid launch configuration
  • Result:
| config | pytorch | paddle dev | paddle this PR | speedup |
|---|---|---|---|---|
| input[2048, 1024, 4, 4], filter[1024, 1, 4, 4], stride=1, pad=0, dilation=1 | 1.1070ms | 2.9660ms | 1.0798ms | 2.75x |
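
For readers unfamiliar with method item 1, here is a minimal host-side C++ sketch of one common way to reduce modulo operations when mapping a flat thread index to NCHW coordinates. It is not the code from this PR: the names `Coord`, `DecomposeWithModulo`, `DecomposeWithoutModulo`, and `SameCoord` are hypothetical, and the actual kernel may restructure its index math differently.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: coordinates of one element in an NCHW tensor.
struct Coord {
  int64_t n, c, h, w;
};

// Baseline: one modulo per dimension, each paired with a fresh division.
inline Coord DecomposeWithModulo(int64_t idx, int64_t C, int64_t H, int64_t W) {
  Coord r;
  r.w = idx % W;
  r.h = (idx / W) % H;
  r.c = (idx / (W * H)) % C;
  r.n = idx / (W * H * C);
  return r;
}

// Variant: reuse each quotient and replace every `%` with a
// multiply-subtract, so only three integer divisions remain.
inline Coord DecomposeWithoutModulo(int64_t idx, int64_t C, int64_t H, int64_t W) {
  Coord r;
  const int64_t q_w = idx / W;   // index with the W dimension stripped
  r.w = idx - q_w * W;           // same value as idx % W
  const int64_t q_h = q_w / H;   // index with W and H stripped
  r.h = q_w - q_h * H;           // same value as q_w % H
  r.n = q_h / C;
  r.c = q_h - r.n * C;           // same value as q_h % C
  return r;
}

// Field-by-field comparison, used to check both variants agree.
inline bool SameCoord(const Coord& a, const Coord& b) {
  return a.n == b.n && a.c == b.c && a.h == b.h && a.w == b.w;
}
```

With the shape from the table (C=1024, H=4, W=4), both functions map flat index 82043 to n=5, c=7, h=2, w=3. On a GPU, integer division and modulo are comparatively expensive, so reusing quotients in this way is a typical micro-optimization in index-heavy kernels.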

ZzSean • Sep 21 '22 09:09

Your PR has been submitted. Thanks for your contribution to this open-source project! Please wait for the CI test results first. See the Paddle CI Manual for details.

paddle-bot[bot] • Sep 21 '22 09:09