
Code error on p. 108

Open zhenkunl opened this issue 3 years ago • 3 comments

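Hello Professor Fan. Reading your book has taught me a great deal about CUDA programming. While reading, I found one problem. Page 108 says: "If you want to remove the constraint on the thread index inside the loop while still avoiding a read-write race, the relevant code can be rewritten as follows: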

real v = 0;
for (int offset = 16; offset > 0; offset >>= 1)
{
      v += s_y[tid + offset];
      __syncwarp();
      s_y[tid] = v;
      __syncwarp();
}

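" The problem is that initializing v to 0 is wrong. Take the thread with tid = 0: after the first iteration its value is s_y[16], not s_y[0] + s_y[16]. Based on my own understanding and a discussion with colleagues, the code should be changed as follows: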

real v = s_y[tid];
for (int offset = 16; offset > 0; offset >>= 1)
{
      v += s_y[tid + offset];
      __syncwarp();
      s_y[tid] = v;
      __syncwarp();
}

Only then is the result correct. For readability, the following form may be better:

for (int offset = 16; offset > 0; offset >>= 1)
{
      real v = s_y[tid + offset];  // declare v here; read the partner element before any lane overwrites it
      __syncwarp();
      s_y[tid] += v;
      __syncwarp();
}

I am not sure whether my understanding is correct, so I would appreciate your advice.
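
To make the difference concrete, here is a minimal host-side sketch in plain C++ rather than a real kernel. It assumes one warp of 32 lanes running in lockstep, and it models each `__syncwarp()` by finishing a phase for all lanes before the next phase starts. The helper names (`make_input`, `reduce_with_accumulator`, `reduce_with_temporary`, `self_check`), the array size `N`, and the input values are hypothetical choices for illustration, not taken from the book.

```cpp
#include <array>
#include <cassert>

using real = double;         // assumption: `real` is a float/double alias, as in the snippets above
constexpr int WARP = 32;     // lanes simulated in lockstep
constexpr int N = 2 * WARP;  // hypothetical shared-array size, so tid + offset stays in bounds

// Hypothetical input: s_y[0..31] = 1..32 (sum 528), the rest is zero.
std::array<real, N> make_input()
{
    std::array<real, N> s_y{};
    for (int i = 0; i < WARP; ++i) s_y[i] = i + 1;
    return s_y;
}

// Accumulator form: v += s_y[tid + offset]; sync; s_y[tid] = v; sync.
// init_from_shared == false reproduces the printed version (v = 0).
real reduce_with_accumulator(std::array<real, N> s_y, bool init_from_shared)
{
    std::array<real, WARP> v{};  // one private "register" per lane, zero-initialized
    if (init_from_shared)
        for (int tid = 0; tid < WARP; ++tid) v[tid] = s_y[tid];

    for (int offset = 16; offset > 0; offset >>= 1)
    {
        for (int tid = 0; tid < WARP; ++tid) v[tid] += s_y[tid + offset];  // read phase
        for (int tid = 0; tid < WARP; ++tid) s_y[tid] = v[tid];            // write phase
    }
    return s_y[0];
}

// Temporary form: v = s_y[tid + offset]; sync; s_y[tid] += v; sync.
real reduce_with_temporary(std::array<real, N> s_y)
{
    std::array<real, WARP> v{};
    for (int offset = 16; offset > 0; offset >>= 1)
    {
        for (int tid = 0; tid < WARP; ++tid) v[tid] = s_y[tid + offset];   // read phase
        for (int tid = 0; tid < WARP; ++tid) s_y[tid] += v[tid];           // write phase
    }
    return s_y[0];
}

// Compares the three variants on the hypothetical input.
void self_check()
{
    assert(reduce_with_accumulator(make_input(), true) == 528);   // v = s_y[tid]: full sum
    assert(reduce_with_accumulator(make_input(), false) == 392);  // v = 0: only 17 + ... + 32
    assert(reduce_with_temporary(make_input()) == 528);           // temporary form: full sum
}
```

In this simulation the `v = 0` version overwrites every `s_y[tid]` with `s_y[tid + 16]` in the first iteration, so the final `s_y[0]` only contains the upper half of the warp's data. The other two forms agree with each other.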

zhenkunl, Sep 09 '22 05:09

Thank you, I have received your email and will handle it as soon as possible. 王景博 (fever wong)

fever-Wong, Sep 09 '22 05:09

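Hello, and thank you very much for spotting this error. The initialization of v is indeed wrong: before the += operation, v must first be assigned the value from shared memory. I will correct this on the main page. This snippet is not part of the example code in the repository, so no source files need to change.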

brucefan1983, Sep 09 '22 10:09

I am very much looking forward to more great content in the second edition.

zhenkunl, Sep 09 '22 10:09

Thank you, I have received your email and will handle it as soon as possible. 王景博 (fever wong)

fever-Wong, Oct 11 '22 08:10