FastMaskRCNN

Some inconsistent details compared with the FPN paper

Open Tete-Xiao opened this issue 7 years ago • 7 comments

Hi,

I've found several details that are inconsistent with the original FPN paper.

  1. The paper states that the RPN heads are shared across all levels of the feature pyramid.
  2. RoIs are assigned to, and cropped from, a pyramid level according to their height and width.

Did I misread your code, or did I misread the paper? Or did you simply decide not to implement it this way for now?

Best, Jason

Tete-Xiao avatar Apr 05 '17 17:04 Tete-Xiao

Meanwhile, the FPN paper claims that anchor boxes lying outside the image are included for training. It seems your code uses the original RPN implementation.

1292765944 avatar Apr 06 '17 11:04 1292765944

@1292765944

  1. Anchors outside the image: there is a parameter, allow_border, that controls how many pixels an anchor may extend outside the image (a sketch of how such a filter typically works is below).
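
For context, this is a minimal sketch of how an allow_border-style filter usually looks in a Faster R-CNN anchor-target step; the function name and shapes are illustrative, not this repo's exact code:

```python
import numpy as np

def keep_inside_anchors(anchors, im_height, im_width, allow_border=0):
    # anchors: (N, 4) array of [x1, y1, x2, y2] boxes.
    # allow_border = 0 keeps only anchors fully inside the image;
    # a very large value keeps cross-boundary anchors as well,
    # which is closer to the FPN paper's setting.
    keep = np.where(
        (anchors[:, 0] >= -allow_border) &
        (anchors[:, 1] >= -allow_border) &
        (anchors[:, 2] < im_width + allow_border) &
        (anchors[:, 3] < im_height + allow_border)
    )[0]
    return keep
```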

@jasonhsiao97

  1. RoI cropping w.r.t. RoI height and width: I considered that and chose not to do it that way, for these reasons (the paper's level-assignment rule is sketched after this list):
  • Only RoIs with IoU greater than 0.7 are counted as positives, which means an RoI is already highly consistent with its current layer, so there should be little need to re-select layers. That is also why I don't explicitly assign ground-truth boxes to each layer: the 0.7-IoU sampling covers it.
  • This is the first version; I'll add it if there is a performance drop. I also separated the sampling function from the decoder to leave room for re-cropping RoIs according to their scale.
  2. RPN head shared across layers:
  • Yes. That's simple, and I'll leave that part until last.
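
For reference, the FPN paper assigns an RoI of width w and height h to pyramid level k = floor(k0 + log2(sqrt(w*h)/224)) with k0 = 4. A minimal sketch of that rule (illustrative only, not this repo's code):

```python
import numpy as np

def fpn_roi_level(rois, k0=4, canonical_size=224.0, k_min=2, k_max=5):
    # rois: (N, 4) array of [x1, y1, x2, y2] in image coordinates.
    # Returns the pyramid level (P2..P5) each RoI would be cropped from
    # under the FPN paper's rule: k = floor(k0 + log2(sqrt(w*h) / 224)).
    w = rois[:, 2] - rois[:, 0]
    h = rois[:, 3] - rois[:, 1]
    k = np.floor(k0 + np.log2(np.sqrt(w * h) / canonical_size + 1e-8))
    return np.clip(k, k_min, k_max).astype(np.int32)
```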

Thank you for participating.

CharlesShang avatar Apr 07 '17 10:04 CharlesShang

@CharlesShang @jasonhsiao97 Is it right that all head architectures (mask branch, Fast R-CNN) should be shared across every scale of the FPN?

@CharlesShang I have a few questions about your code:

  1. When building the pyramid losses, you only put the two box losses into the collection, and in resnet50_test you only use those two losses to update the variables. Is this behaviour intentional?
  2. If I am not misreading your code, it currently seems to allow only a batch size of 1. (Is that right? Your crop.py seems to only crop features from the first image; see the crop_and_resize sketch below.)
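
For illustration (this is not the repo's crop.py, just the underlying TensorFlow op), tf.image.crop_and_resize takes a box_ind argument that maps each box to an image in the batch; if it is always zero, only the first image is ever cropped:

```python
import tensorflow as tf

# features: [batch, H, W, C] feature map; boxes: [num_rois, 4] normalized
# [y1, x1, y2, x2]. box_ind maps each RoI to the image in the batch it came
# from; a tensor of all zeros restricts cropping to the first image only.
features = tf.placeholder(tf.float32, [None, 64, 64, 256])
boxes = tf.placeholder(tf.float32, [None, 4])
box_ind = tf.zeros([tf.shape(boxes)[0]], dtype=tf.int32)  # batch-size-1 assumption
crops = tf.image.crop_and_resize(features, boxes, box_ind, crop_size=[14, 14])
```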

santisy avatar Apr 08 '17 17:04 santisy

Hi @CharlesShang, I have the same question as @santisy. At line 61 of resnet50_test.py, the losses of pyramid_network are calculated but never used: **outputs = pyramid_network.build_losses(pyramid, outputs, gt_boxes, gt_masks, num_classes=81, base_anchors=15)**. It seems you only update the gradients based on rpn_box_loss and refined_box_loss: loss = tf.get_collection(tf.GraphKeys.LOSSES); regular_loss = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES); total_loss = tf.add_n(loss + regular_loss). Did I misunderstand your code? Is this a bug, or did you do it on purpose?

lihungchieh avatar Apr 12 '17 06:04 lihungchieh

@santisy @lihungchieh In the function build_losses in pyramid_network.py, the author defines the other loss terms with tf.losses.softmax_cross_entropy.

And the documentation says this automatically registers the computed loss into tf.GraphKeys.LOSSES: https://www.tensorflow.org/versions/r1.1/api_docs/python/tf/losses/softmax_cross_entropy

Also, you can print out loss after loss = tf.get_collection(tf.GraphKeys.LOSSES) in resnet50_test.py; its length is 20 (5 loss terms × 4 different stride sizes in the FPN).
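
A minimal sketch of that mechanism (names are illustrative, not the repo's exact code):

```python
import tensorflow as tf

# tf.losses.* ops add their result to the tf.GraphKeys.LOSSES collection,
# so collecting that key later picks up every registered loss term.
logits = tf.placeholder(tf.float32, [None, 81])
labels = tf.placeholder(tf.float32, [None, 81])  # one-hot class labels
cls_loss = tf.losses.softmax_cross_entropy(labels, logits)  # auto-registered

losses = tf.get_collection(tf.GraphKeys.LOSSES)
regular_losses = tf.get_collection(tf.GraphKeys.REGULARIZATION_LOSSES)
total_loss = tf.add_n(losses + regular_losses)  # sums every collected term
```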

Hope this helps~

AaronYALai avatar Apr 12 '17 10:04 AaronYALai

@AaronYALai , Yes, the losses will be registered to tf.GraphKeys.LOSSES automatically, thanks for making this clear.

lihungchieh avatar Apr 12 '17 11:04 lihungchieh

@Tete-Xiao @CharlesShang @santisy @1292765944 In the FPN code here I only find P2, P3, P4, and P5, but the original paper also uses P6. Where is P6 in this project? Am I missing something?
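
For reference, in the FPN paper P6 is just a stride-2 subsampling of P5 and is used only as an extra anchor scale for the RPN, not for RoI pooling. A minimal sketch of how it could be added (not this repo's code):

```python
import tensorflow as tf

# P6 as in the FPN paper: stride-2 subsampling (max pooling) of P5,
# fed only to the RPN as an additional anchor level.
P5 = tf.placeholder(tf.float32, [None, 32, 32, 256])
P6 = tf.layers.max_pooling2d(P5, pool_size=1, strides=2, name='P6')
```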

Superlee506 avatar Apr 25 '18 04:04 Superlee506