opencv_contrib/modules/cudaimgproc/src/cuda/connectedcomponents.cu line 273

Open ucassdlfj opened this issue 2 years ago • 4 comments

At this line, I was wondering why 'info' is read from the cuda::PtrStepSzb img. In the kernel 'InitLabeling', all the info for each block is saved in 'labels'. For example, for a 5*5 image, the info for location (4,4) is saved through 'last_pixel' at (3,3) in labels; see line 347.

ucassdlfj avatar Dec 06 '23 10:12 ucassdlfj

Could you please provide additional details or a brief explanation? I'm finding it a bit challenging to grasp the issue with the current information provided, and I would appreciate more context to better understand what you're trying to solve. Thank you.

yashgarg1703 avatar Dec 16 '23 05:12 yashgarg1703

> At this line, I was wondering why 'info' is read from the cuda::PtrStepSzb img. In the kernel 'InitLabeling', all the info for each block is saved in 'labels'. For example, for a 5*5 image, the info for location (4,4) is saved through 'last_pixel' at (3,3) in labels; see line 347.

Your question does not have enough detail. That said, it seems you are asking why the initial value from InitLabeling is saved to global memory and then read back by an additional kernel, as this seems unnecessary for a 5x5 image?

If so, this is a question for the forum and not an issue. That said, the reason is that images are processed in blocks by separate groups of threads, and the "only" way they can communicate with each other (it actually depends on the CUDA version and how it's implemented, but here at least) is through global memory. You could process a single 5x5 image in a single kernel, but I can't think of a reason why you would process such a small image on the GPU.
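As a rough illustration of that point, here is a toy host-side C++ model, not the code in connectedcomponents.cu. The names `init_pass` and `neighbor_pass` are hypothetical, and a 1D row stands in for the image. The first pass publishes one provisional label per foreground element into a shared buffer (standing in for global memory); the second pass can only learn about its neighbor by reading what that neighbor published.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Pass 1 (stand-in for an init kernel): every foreground element publishes its
// own index as a provisional label; background elements get -1.
std::vector<int> init_pass(const std::vector<unsigned char>& img) {
    std::vector<int> labels(img.size(), -1);
    for (std::size_t i = 0; i < img.size(); ++i)
        if (img[i]) labels[i] = static_cast<int>(i);
    return labels;
}

// Pass 2 (stand-in for a later kernel): each foreground element reads the label
// its left neighbor published in pass 1 and keeps the smaller of the two.
// Reads go to `published` only, so this models a single propagation step.
std::vector<int> neighbor_pass(const std::vector<unsigned char>& img,
                               const std::vector<int>& published) {
    assert(img.size() == published.size());
    std::vector<int> out(published);
    for (std::size_t i = 1; i < img.size(); ++i)
        if (img[i] && img[i - 1])
            out[i] = std::min(published[i], published[i - 1]);
    return out;
}
```

This only models the communication pattern between two passes. The real implementation needs further kernels to propagate labels across a whole component.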

cudawarped avatar Dec 16 '23 06:12 cudawarped

@cudawarped, I'm sorry for the oversimplified issue. I'm a CUDA developer and an image processing algorithm engineer, so here is my personal understanding. In the file opencv_contrib/modules/cudaimgproc/src/cuda/connectedcomponents.cu, five kernel functions are executed in order to implement the connected components algorithm. Let's assume "img" has odd width and height, for example 5 and 5.

1. In the kernel function "InitLabeling", "info" is saved at the address pointed to by "last_pixel". For thread (2, 2) in thread block (0,0), the local variables "row" and "col" both equal 4, so the conditions on line 200 (col+1<labels.cols) and line 203 (row+1<labels.rows) are not met, and "info" for pixel (4,4) is saved at (3,3) (because the input pointer "last_pixel" equals labels.data + ((labels.rows - 2) * labels.step) + (labels.cols - 2) * labels.elemSize()). Below is a simplified layout of "labels":

|--------------------------------->
| labels(0,0) | info(0,0) | labels(2,0) | info(2,0) | labels(4,0)
| 0           | 0         | 0           | 0         | info(4,0)
| labels(0,2) | info(0,2) | labels(2,2) | info(2,2) | labels(4,2)
| 0           | 0         | 0           | info(4,4) | info(4,2)
| labels(0,4) | info(0,4) | labels(2,4) | info(2,4) | labels(4,4)
v

2. The kernel functions "Compression" and "Merge" update labels according to the "info" in each block (2*2).

3. In the kernel "FinalLabeling", the labels in each block are set according to "info", so for (4,4), aren't we supposed to extract "info" from (3,3) in labels? But in the source code, lines 271 to 273 read:
"""
// Read from the input image
// "a" is already in position 0
info = img[row * img.step + col];
"""
Technically speaking, I don't understand the code here: "info" for each block has nothing to do with "img" before this point. Maybe this "issue" doesn't have much influence on the results, considering it only happens when the image has odd width and height, but I think it's worth pointing out.
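The placement rule described in point 1 can be sketched on the host as a small C++ helper. This is only a model of the rule as stated in this comment, not the OpenCV kernel itself. The name `info_slot` is hypothetical, coordinates are given as (row, col), and the fallback assumes a labels matrix with at least 2 rows and 2 columns.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (not part of OpenCV): returns the (row, col) cell of the
// labels matrix that would hold the "info" byte for the 2x2 block whose
// top-left pixel is (row, col), following the rule described in point 1.
std::pair<int, int> info_slot(int row, int col, int rows, int cols) {
    assert(rows >= 2 && cols >= 2);  // assumption: the fallback cell must exist
    if (col + 1 < cols) return {row, col + 1};  // cell to the right is inside the image
    if (row + 1 < rows) return {row + 1, col};  // otherwise use the cell below
    return {rows - 2, cols - 2};                // otherwise the "last_pixel" fallback
}
```

For a 5x5 labels matrix, `info_slot(4, 4, 5, 5)` falls through both conditions and returns (3, 3), which is the free bottom-right cell of the block at (2,2) in the layout above.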

ucassdlfj avatar Dec 18 '23 03:12 ucassdlfj

> In the kernel "FinalLabeling", the labels in each block are set according to "info", so for (4,4), aren't we supposed to extract "info" from (3,3) in labels? But in the source code, lines 271 to 273

It looks that way, although the comment seems to imply otherwise. From a quick scan of the paper I would also naively assume this should come from the label image (I8 in figure 8), but without that line it fails the tests. I would ask @stal12 for an explanation.

Can you supply a test case where this fails to correctly calculate the CCL?

cudawarped avatar Dec 18 '23 10:12 cudawarped