opencv_contrib
opencv_contrib/modules/cudaimgproc/src/cuda/connectedcomponents.cu line 273
On this line, I was wondering why 'info' is read from cuda::PtrStepSzb img. In the kernel 'InitLabeling', the info for every block is saved in 'labels'; for a 5*5 image, for example, the info for location (4,4) is saved through 'last_pixel' at (3,3) in labels. See line 347.
Could you please provide additional details or a brief explanation? I'm finding it a bit challenging to grasp the issue with the current information provided, and I would appreciate more context to better understand what you're trying to solve. Thank you.
Your question does not have enough detail. That said, it seems like you are asking why you would save the initial value from InitLabeling to global memory and then run an additional kernel to read it back, as this seems unnecessary when you have a 5x5 image?
If so, this is a question for the forum and not an issue. That said, the reason is that images are processed in blocks by separate groups of threads, and the "only" way they can communicate with each other (it actually depends on the CUDA version and how it is implemented, but it holds here at least) is through global memory. You could process a single 5x5 image in a single kernel, but I can't think of a reason why you would process such a small image on the GPU.
@cudawarped, I'm sorry for the oversimplified issue. I'm a CUDA developer and an image processing algorithm engineer, so here is my understanding. In the file opencv_contrib/modules/cudaimgproc/src/cuda/connectedcomponents.cu, five kernel functions are executed in order to implement the connected components algorithm. Let's assume "img" has odd width and height, for example 5 and 5.
1. In the kernel function "InitLabeling", "info" can be written to the address pointed to by "last_pixel". For thread (2,2) in thread block (0,0), the local variables "row" and "col" both equal 4, so the conditions on line 200 (col+1<labels.cols) and line 203 (row+1<labels.rows) are not met, and the "info" for pixel (4,4) is saved at (3,3) (because the input pointer "last_pixel" equals labels.data + ((labels.rows - 2) * labels.step) + (labels.cols - 2) * labels.elemSize()). Below is a simplified layout of "labels":
|--------------------------------->
| labels(0,0) | info(0,0) | labels(2,0) | info(2,0) | labels(4,0)
| 0           | 0         | 0           | 0         | info(4,0)
| labels(0,2) | info(0,2) | labels(2,2) | info(2,2) | labels(4,2)
| 0           | 0         | 0           | info(4,4) | info(4,2)
| labels(0,4) | info(0,4) | labels(2,4) | info(2,4) | labels(4,4)
|
v
2. The kernel functions "Compression" and "Merge" update the labels according to the "info" of each block (2*2).
3. In the kernel "FinalLabeling", the labels in each block are set according to "info", so for (4,4), aren't we supposed to read "info" from (3,3) in labels? But in the source code, lines 271 to 273 read:
"""
// Read from the input image
// "a" is already in position 0
info = img[row * img.step + col];
"""
Technically speaking, I don't understand the code here: before this point, the "info" for each block has nothing to do with "img".
This "issue" may not have much influence on the results, considering it only happens when the image has odd width and height, but I think it is worth pointing out.
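To make the layout above concrete, here is a minimal host-side sketch in C++ of the three-way choice as I understand it. `infoSlot` is a hypothetical helper written only for illustration (it is not a function in OpenCV), and the `(rows - 2, cols - 2)` fallback simply restates the "last_pixel" offset quoted above, so it assumes odd dimensions larger than 1.

```cpp
#include <utility>

// Hypothetical helper, not part of OpenCV: returns the (row, col) of the
// "labels" element that would hold the info byte for the 2x2 block whose
// top-left pixel is (row, col), following the three cases described above.
std::pair<int, int> infoSlot(int row, int col, int rows, int cols) {
    if (col + 1 < cols) {
        return {row, col + 1};   // case 1: the pixel to the right exists
    }
    if (row + 1 < rows) {
        return {row + 1, col};   // case 2: otherwise use the pixel below
    }
    // case 3: bottom-right corner of an odd-sized image; assumed to map to
    // the "last_pixel" offset (rows - 2, cols - 2), valid only for sizes > 1
    return {rows - 2, cols - 2};
}
```

For a 5*5 image, `infoSlot(4, 4, 5, 5)` gives row 3, column 3, which is the slot I would expect "FinalLabeling" to read back for the corner block, while `infoSlot(0, 4, 5, 5)` gives row 1, column 4, matching the info(4,0) cell in the layout.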
In the kernel "FinalLabeling", the labels in each block are set according to "info", so for (4,4), aren't we supposed to read "info" from (3,3) in labels? But in the source code, lines 271 to 273
It looks that way, although the comment seems to imply otherwise. From a quick scan of the paper I would also naively assume this should come from the label image (I8 in figure 8), but without that line the tests fail. I would ask @stal12 for an explanation.
Can you supply a test case where this fails to calculate the CCL correctly?