faster-rcnn-resnet

Why reshape the top blob in the RPN?

Open LucasMahieu opened this issue 6 years ago • 7 comments

I don't understand why there is this line: top[0].reshape(1, 5) in this file.

According to what I have understood, and according to the comment just before this line:

     # rois blob: holds R regions of interest, each is a 5-tuple
     # (n, x1, y1, x2, y2) specifying an image batch index n and a
     # rectangle (x1, y1, x2, y2)

Are you sure this reshape is correct?

LucasMahieu avatar Mar 08 '18 12:03 LucasMahieu

The first dimension holds dummy indices, which are set to 0 and are expected by the loss layer.

Microos avatar Apr 04 '18 01:04 Microos

I agree with that. The problem is that the reshape should be: top[0].reshape(N, 5)

With N = cfg.POST_NMS_KEEP

Isn't that right?

LucasMahieu avatar Apr 04 '18 13:04 LucasMahieu

Where is this top[0].reshape(1, 5)? Is it in setup() or forward()?

Microos avatar Apr 04 '18 13:04 Microos

In the setup() part.

LucasMahieu avatar Apr 04 '18 13:04 LucasMahieu

And then a reshape is done in the forward() function...

But in my opinion it would be more logical to give top[0] the right shape in the setup() step.

LucasMahieu avatar Apr 04 '18 13:04 LucasMahieu

OK, I just saw the line you referred to.
This top, like the other tops defined in setup(), is used by Caffe to perform a check at the network initialization stage. Caffe checks whether the dimensions of all the blobs match. Imagine that Caffe creates dummy data according to your top's shape and lets it flow through the whole network to check whether the dimensions of each layer's output are valid for all the subsequent layers. Therefore, you don't need to set the top to (N, 5), because what matters is the shape, not the data inside. And of course, you cannot set the top to just any other shape; it will fail when Caffe performs the initialization. Hope this helps :D

Microos avatar Apr 04 '18 13:04 Microos

Then forward() is called when Caffe has your real data flowing through the network. In this function you can output a top with a dynamic shape, say (2000, 5), as this proposal layer normally does.
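To make the setup()/forward() split concrete, here is a minimal sketch in plain Python. It does not use Caffe: FakeBlob and ProposalLayerSketch are hypothetical stand-ins that only mimic how a top blob is given a placeholder shape in setup() and then reshaped to the real number of proposals in forward(). The proposal values are made up for illustration.

```python
# Hypothetical stand-ins: these classes only imitate the reshape bookkeeping
# of a blob and a Python layer. They are NOT the real Caffe API.

class FakeBlob:
    """Tiny 2-D blob that tracks a shape and zero-filled data."""

    def __init__(self):
        self.shape = ()
        self.data = []

    def reshape(self, rows, cols):
        # Reallocate storage to the requested shape, filled with zeros.
        self.shape = (rows, cols)
        self.data = [[0.0] * cols for _ in range(rows)]


class ProposalLayerSketch:
    """Imitates a proposal layer with a placeholder top shape in setup()."""

    def setup(self, bottom, top):
        # Placeholder shape used at initialization time. Each roi is a
        # 5-tuple (n, x1, y1, x2, y2); the row count is not known yet.
        top[0].reshape(1, 5)

    def forward(self, bottom, top):
        # Here 'bottom' is simply a list of rois (an assumption made for
        # this sketch). The top is reshaped to the real number of rois.
        rois = bottom
        top[0].reshape(len(rois), 5)
        for i, roi in enumerate(rois):
            top[0].data[i] = list(roi)


top = [FakeBlob()]
layer = ProposalLayerSketch()

layer.setup([], top)
print(top[0].shape)  # (1, 5)

# 2000 made-up proposals, all with batch index 0.
proposals = [(0, 0.0, 0.0, 10.0 + i, 10.0 + i) for i in range(2000)]
layer.forward(proposals, top)
print(top[0].shape)  # (2000, 5)
```

The point of the sketch is only that the shape set in setup() is a placeholder, while forward() is free to reshape the top to whatever the real data requires.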

Microos avatar Apr 04 '18 14:04 Microos