FastMaskRCNN

Does anyone have an idea how to test the model?

Open sairin1202 opened this issue 7 years ago • 17 comments

I have trained the model for 500,000 iterations and obtained the ckpt file, but I have no idea how to test it. Has anyone managed to test the model?

sairin1202 avatar Jun 14 '17 08:06 sairin1202

Maybe this will help: https://github.com/CharlesShang/FastMaskRCNN/issues/26

hari-sikchi avatar Jun 14 '17 12:06 hari-sikchi

Thank you very much, but it seems that datatest.py only tests the training data of the COCO dataset and does not use the trained network.

sairin1202 avatar Jun 16 '17 03:06 sairin1202

@sairin1202 It would be very nice if you could share the files of the 500k-iteration model, so that we don't have to spend resources on training it. Thanks. I am currently trying to write code to test our trained models.

Cangrc avatar Jun 16 '17 12:06 Cangrc

@Cangrc https://www.dropbox.com/s/k0vuldvzsxc18xu/coco_resnet50_model.ckpt-499999.zip?dl=0 Here is the model. If you have any ideas, please let me know as soon as possible. Thank you very much.

sairin1202 avatar Jun 16 '17 13:06 sairin1202

I also want to know how to test...

mariolew avatar Jun 19 '17 09:06 mariolew

I don't know how to test the trained models.

cuihuitao avatar Jun 19 '17 11:06 cuihuitao

Same

abduallahmohamed avatar Jun 20 '17 11:06 abduallahmohamed

Thanks for sharing, but could you also share the file named 'checkpoint'? And which TF version did you use?

imyourm8 avatar Jun 22 '17 09:06 imyourm8

@sairin1202, thanks for sharing the 500k model.

@CharlesShang, I have built inference code for testing the model with random images, but I am stuck in the following two places:

1) pyramid_network.py, build_heads, line 291 (if not is_training:): you assign final_boxes as the RoIs, but the subsequent lines (293 and 294) are commented out. So assigned_rois and assigned_batch_inds (lines 296-297) keep the values populated at line 255, which are the output of the RPN. As a result, during testing (when is_training = False) the mask head depends on the RPN RoIs and not on the refined boxes (final_boxes).

2) sample.py, sample_rpn_outputs_wrt_gt_boxes (invoked from pyramid_network.py -> build_heads -> sample_rpn_outputs_with_gt): this method checks gt_boxes.size at line 86. However, during testing we should not have any gt_boxes. What are the implications if this if block is not executed during the testing phase, so that only the else block (line 113) runs?

It would be a great help if you could clarify these two points.

Thanks in advance Dipanjan

dipanjan06 avatar Jun 22 '17 10:06 dipanjan06

Please take into account that any model trained before #50 is probably useless.

amirbar avatar Jun 22 '17 15:06 amirbar

@amirbar Thanks for your advice; it is a very good observation. I am now training the network on a Quadro K5200, which has 8 GB of GPU memory. However, I am getting an out-of-memory error and so I am not able to train. Is there any way to train the network with an 8 GB GPU? Or, if you have already trained the network, would it be possible to share the model?

Thanks in advance Dipanjan

dipanjan06 avatar Jun 23 '17 07:06 dipanjan06

@dipanjan06 See https://github.com/CharlesShang/FastMaskRCNN/blob/master/train/train.py#L230. Maybe a lower fraction would solve your problem.
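A minimal sketch of what lowering the fraction could look like, assuming TF 1.x (the version family that provides tf.Session and tf.ConfigProto) and assuming the "fraction" refers to per_process_gpu_memory_fraction. The helper name make_session_config is hypothetical and not part of the repo.

```python
def make_session_config(memory_fraction=0.5, allow_growth=True):
    """Build a TF 1.x session config that caps GPU memory use.

    Hypothetical helper for illustration; not taken from FastMaskRCNN.
    """
    if not 0.0 < memory_fraction <= 1.0:
        raise ValueError("memory_fraction must be in (0, 1]")

    # Imported lazily so the argument check works without TensorFlow installed.
    import tensorflow as tf  # assumes TensorFlow 1.x

    gpu_options = tf.GPUOptions(
        # Upper bound on the share of each visible GPU's memory TF may use.
        per_process_gpu_memory_fraction=memory_fraction,
        # Allocate memory on demand instead of reserving it all up front.
        allow_growth=allow_growth,
    )
    return tf.ConfigProto(gpu_options=gpu_options, allow_soft_placement=True)
```

Usage would be sess = tf.Session(config=make_session_config(0.5)). Note that this only limits how much memory TensorFlow reserves; if the model itself needs more than the card offers, a smaller input size or batch is probably the only remedy.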

ray0809 avatar Jul 25 '17 07:07 ray0809

@sairin1202 Would you please share the 'checkpoint' file after 500K iterations?
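For what it's worth, the 'checkpoint' file that tf.train.Saver writes is, as far as I know, only a small text file recording the name of the latest checkpoint, so it can probably be recreated by hand. A minimal sketch, assuming the checkpoint data files from the shared zip are already in the directory; write_checkpoint_state is a hypothetical helper name, not a TensorFlow or repo function.

```python
import os


def write_checkpoint_state(checkpoint_dir, ckpt_name):
    """Write a minimal 'checkpoint' state file that points at ckpt_name.

    ckpt_name is the checkpoint prefix relative to checkpoint_dir,
    for example "coco_resnet50_model.ckpt-499999".
    """
    lines = [
        'model_checkpoint_path: "{}"'.format(ckpt_name),
        'all_model_checkpoint_paths: "{}"'.format(ckpt_name),
    ]
    path = os.path.join(checkpoint_dir, "checkpoint")
    with open(path, "w") as f:
        f.write("\n".join(lines) + "\n")
    return path
```

The file is only consulted when the latest checkpoint is looked up by directory (for example with tf.train.latest_checkpoint). Passing the full checkpoint prefix directly to saver.restore should not need it.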

ray0809 avatar Jul 25 '17 09:07 ray0809

I can load the model from the Checkpoint (as below), and get predictions from it. However, I can only get predictions for an output_tensor before the ROIAlign layer, but not for any tensor in the graph after that layer. Example:

# this works
output_tensor = graph.get_tensor_by_name("pyramid_1/ROIAlign_2/Crop:0")
# this doesn't work
output_tensor = graph.get_tensor_by_name("pyramid_1/ROIAlign_2/Reshape_2:0")

There's a bug in the ROIAlign layer involving the py_func used to generate anchors.

  • My understanding is that the py_func is not stored in the graph and cannot be reconstructed.
  • Or is this bug related to casting anchor coordinates to ints?
  • Can anyone here (@amirbar, @sairin1202, @hari-sikchi, @Cangrc, @mariolew, @cuihuitao) try to load the graph this way, and reproduce the error or find a workaround?

import tensorflow as tf

tf.reset_default_graph()
checkpoint_dir = "/my/directory/for/segmentation/FastMaskRCNN/output/mask_rcnn/500k/"
meta_file_path = checkpoint_dir + "coco_resnet50_model.ckpt-499999.meta"
ckpt_file_path = checkpoint_dir + "coco_resnet50_model.ckpt-499999"

# Load the graph structure from the .meta file
saver = tf.train.import_meta_graph(meta_file_path, clear_devices=True)
graph = tf.get_default_graph()

# Initialize the variables, then overwrite them with the checkpoint weights
sess = tf.Session()
sess.run(tf.global_variables_initializer())
saver.restore(sess, ckpt_file_path)

# Input image tensor and mask output tensor
input_tensor = graph.get_tensor_by_name("Reshape_2:0")
output_tensor = graph.get_tensor_by_name("pyramid_2/MaskEncoder/Reshape_2:0")

# Mask output; `images` is the input data to feed
results = sess.run(
    [output_tensor],
    # [tf.reduce_mean(input_tensor)],
    feed_dict={input_tensor: images}
)

simaoh avatar Aug 17 '17 23:08 simaoh

@simaoh Could you please tell me where the variable 'images' in the following block comes from? Thanks a lot.

# Mask output 
results = sess.run(
    [output_tensor],
    #[tf.reduce_mean(input_tensor)],
    feed_dict={input_tensor:images}
)

And I am still wondering how to show the matting result for a new image. Is the step

run python download_and_convert_data.py to build tf-records

in readme.md necessary for testing?

XieSufe avatar Sep 10 '17 03:09 XieSufe

@XieSufe, images is just a list of numpy arrays that you want to feed to the graph. As an example, just do

from PIL import Image
import numpy as np

pim = Image.open('/dir/to/image.jpg')
nim = np.array(pim)
images = [nim]

simaoh avatar Sep 14 '17 01:09 simaoh

@simaoh

Made any progress with this? I'm running into this issue whilst trying to freeze the model. As far as I'm aware the only way to get around it is to implement whatever is wrapped by py_func using only TF ops.
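A first step might be to find exactly which ops in the imported graph come from py_func. A minimal sketch, assuming tf.py_func creates ops of type "PyFunc" or "PyFuncStateless", as it does in TF 1.x as far as I know; find_py_func_ops is a hypothetical helper name.

```python
# Op types that tf.py_func produces in TF 1.x (stateful and stateless variants).
PY_FUNC_OP_TYPES = ("PyFunc", "PyFuncStateless")


def find_py_func_ops(graph):
    """Return the names of all ops in `graph` that wrap a Python function.

    `graph` is expected to behave like a tf.Graph: get_operations() must
    return objects that have `name` and `type` attributes.
    """
    return [
        op.name
        for op in graph.get_operations()
        if op.type in PY_FUNC_OP_TYPES
    ]
```

Calling find_py_func_ops(tf.get_default_graph()) after import_meta_graph should list the ops to replace. Only a reference to the Python function is stored in the graph, not its body, so tensors downstream of these ops presumably cannot be evaluated or frozen until each one is rewritten with TF ops.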

louisquinn avatar Oct 17 '17 07:10 louisquinn