AutoDock-GPU
Invocations to __syncthreads() and barrier() in calcMergeEneGra{.cu/.cl}
Hi,
By comparing the CUDA and OpenCL versions of gpu_calc_energrad() (develop branch), I found that the following invocations of __syncthreads() do not have barrier() counterparts:
- https://github.com/ccsb-scripps/AutoDock-GPU/blob/develop/cuda/calcMergeEneGra.cu#L414
- https://github.com/ccsb-scripps/AutoDock-GPU/blob/develop/cuda/calcMergeEneGra.cu#L696
- https://github.com/ccsb-scripps/AutoDock-GPU/blob/develop/cuda/calcMergeEneGra.cu#L816
It seems to me that these __syncthreads() invocations are not required, but I might be missing something. Also, in a few tests I ran after removing them, I saw virtually no performance benefit.
Could you please check and comment on that?
@L30nardoSV Those synchronization points (along with thread fences, which I did remove after thorough testing) were put in there during the later stages of bug hunting, and since they don't hurt performance I didn't mind leaving them in.
So yes, you are probably correct but I might also be missing something ;-)
Edit: Also, I don't think a corresponding barrier is needed on the OpenCL side ...
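For readers following along, here is a minimal sketch of the general distinction discussed above: a barrier that is required versus one that is redundant. It is not taken from calcMergeEneGra.cu; the kernel `block_sum`, the helper `run_block_sum`, the `extra_barrier` flag, and the block size are all hypothetical names chosen for illustration. In OpenCL, the required barriers would correspond to `barrier(CLK_LOCAL_MEM_FENCE)`.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

constexpr int BLOCK = 64;  // hypothetical block size (power of two)

// Hypothetical kernel for illustration only, not AutoDock-GPU code.
__global__ void block_sum(const float* in, float* out, bool extra_barrier)
{
    __shared__ float partial[BLOCK];
    const int tid = threadIdx.x;

    partial[tid] = in[tid];
    __syncthreads();  // required: other threads read partial[] next

    for (int stride = BLOCK / 2; stride > 0; stride >>= 1) {
        if (tid < stride) {
            partial[tid] += partial[tid + stride];
        }
        __syncthreads();  // required: the next iteration reads these sums
    }

    // Redundant: no shared-memory write has happened since the last
    // barrier. The condition is uniform across the block, so every
    // thread either reaches this barrier or skips it.
    if (extra_barrier) {
        __syncthreads();
    }

    if (tid == 0) {
        *out = partial[0];
    }
}

// Host helper: runs one block over BLOCK ones and returns the sum.
float run_block_sum(bool extra_barrier)
{
    float h_in[BLOCK];
    for (int i = 0; i < BLOCK; ++i) {
        h_in[i] = 1.0f;
    }

    float* d_in = nullptr;
    float* d_out = nullptr;
    float h_out = -1.0f;
    cudaMalloc(&d_in, sizeof(h_in));
    cudaMalloc(&d_out, sizeof(float));
    cudaMemcpy(d_in, h_in, sizeof(h_in), cudaMemcpyHostToDevice);

    block_sum<<<1, BLOCK>>>(d_in, d_out, extra_barrier);

    // cudaMemcpy on the default stream waits for the kernel to finish.
    cudaMemcpy(&h_out, d_out, sizeof(float), cudaMemcpyDeviceToHost);
    cudaFree(d_in);
    cudaFree(d_out);
    return h_out;
}

int main()
{
    printf("with extra barrier:    %.1f\n", run_block_sum(true));
    printf("without extra barrier: %.1f\n", run_block_sum(false));
    return 0;
}
```

Both calls should return the same sum, since the extra barrier separates no conflicting shared-memory accesses. Whether the three barriers linked above fall into this category depends on the surrounding reads and writes in the actual kernel, which this sketch does not reproduce.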