Some ideas for improving dct and idct

Open ZixuanJiang opened this issue 6 years ago • 0 comments

  • apply the scale factor inside precomputeExpk

  • zero padding to avoid branch divergence

  • in-place vs. out-of-place cuFFT, especially in idct

  • number of threads in idct: M/2 * N/2 or M/2 * (N/2+1)

  • other improvements based on profiling
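
To make the first and fourth ideas concrete, here is a minimal host-side sketch in Python. It is not the repository's code: the helper names (`precompute_expk`, `apply_twiddle`, `idct_thread_counts`) and the twiddle definition exp(-j*pi*k/(2N)) are assumptions chosen for illustration. It shows that a constant scale can be folded into the precomputed table once instead of being multiplied per element later, and it evaluates the two candidate thread counts from the list.

```python
import cmath
import math


def precompute_expk(n, scale=1.0):
    """Twiddle table scale * exp(-j*pi*k / (2n)) for k in [0, n).

    Assumed form of the table; passing `scale` folds the constant in once.
    """
    return [scale * cmath.exp(-1j * math.pi * k / (2 * n)) for k in range(n)]


def apply_twiddle(spectrum, expk, scale=1.0):
    """Real part of expk[k] * spectrum[k], with an optional per-element scale."""
    return [scale * (w * s).real for w, s in zip(expk, spectrum)]


def idct_thread_counts(m, n):
    """The two candidate thread counts for an M x N idct listed above."""
    return (m // 2) * (n // 2), (m // 2) * (n // 2 + 1)
```

Calling `apply_twiddle(x, precompute_expk(n), scale)` and `apply_twiddle(x, precompute_expk(n, scale))` should agree up to rounding, so the second form saves one multiply per element in the hot loop. For the thread counts, N/2+1 matches the number of complex values a real-to-complex FFT keeps along a dimension of length N, which is presumably why that variant is under consideration; whether it is faster is a question for profiling.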

ZixuanJiang, Apr 29 '19 16:04