Mask_RCNN
ResNeXt backbone
Has anyone implemented the ResNeXt backboned version of Mask R-CNN and tested the results?
I have been working on it for a few days, but I keep getting NaN and zero-valued losses. I might have a bug in my ResNeXt implementation.
I am currently trying to transfer the weights from the FAIR (Detectron) version of Mask R-CNN; if I succeed, I may share the pretrained weights.
The Detectron model file for ResNet-101 is here: https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train%3Acoco_2014_valminusminival/generalized_rcnn/model_final.pkl
Implementation done on my side (ResNeXt-50, cardinality = 32; ResNeXt-101, cardinality = 32). Testing it. Will update you if it works.
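For readers following along, the distinguishing piece of a ResNeXt block (versus plain ResNet) is grouped convolution with a cardinality hyper-parameter: the channels are split into groups that are convolved independently and then concatenated. A minimal numpy sketch of a 1x1 grouped convolution illustrates the idea (all names and shapes here are illustrative, not code from the repo):

```python
import numpy as np

def grouped_conv1x1(x, weights, cardinality):
    """1x1 grouped convolution: split channels into `cardinality` groups,
    convolve each group independently, then concatenate the outputs.
    x: (H, W, C_in); weights: list of (C_in//cardinality, C_out//cardinality)."""
    groups = np.split(x, cardinality, axis=-1)       # cardinality arrays of (H, W, C_in/card)
    outs = [g @ w for g, w in zip(groups, weights)]  # per-group 1x1 conv == matmul over channels
    return np.concatenate(outs, axis=-1)             # (H, W, C_out)

# Example: ResNeXt-style bottleneck of width 128 with cardinality 32 -> 4 channels per group
card = 32
x = np.random.randn(7, 7, 128)
w = [np.random.randn(128 // card, 128 // card) for _ in range(card)]
y = grouped_conv1x1(x, w, card)
print(y.shape)  # (7, 7, 128)
```

With cardinality 1 this reduces to an ordinary 1x1 convolution, which is why the rest of the bottleneck structure can stay identical to ResNet's.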
@ericj974 any updates? I successfully trained the model using VGG16 and VGG19 backbones.
see model.py and implementation of resnet_graph
@BugaDM what were the accuracy, training, and inference times of VGG16 and VGG19 compared to the original?
@valikund I just did a small training run on my own dataset to check that the backbone was working. I cannot give you any reliable metric, but VGG16 and VGG19 results were worse in terms of accuracy and mAP. Training time was better with both VGGs than with ResNet-101. Inference time was the same (around 0.2 seconds with images of size 1024x1024).
I'm trying to implement some more backbones and carry out some full trainings with all of them. I'll update you when I finish.
@ericj974 I tested your model with COCO weights and ImageNet weights, but I got this error:

ValueError: Layer #9 (named "res2a_branch2b") expects 0 weight(s), but the saved weights have 2 element(s)

Could you tell me how you configured it when you tested? Thanks a lot.
@ericj974: Thanks for your code. Could you tell me how ResNeXt performs in comparison with ResNet-101 and ResNet-50?
@BugaDM, what about smaller backbones like MobileNet? That should give faster inference, at the cost of accuracy/mAP.
@ericj974 How did you deal with the weights? I got errors similar to @chenyuZha's. Any progress?
@paulcx @chenyuZha sorry for my late reply. Actually, you cannot load the h5 file (mask_rcnn_coco.h5), since it assumes a standard ResNet, not ResNeXt, as the backbone encoder. You will have to train from scratch.
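The underlying issue is that a saved weight file can only be reused for layers whose names and weight shapes still match the new architecture (Keras's load_weights with by_name=True follows this principle). A small pure-Python sketch of that matching logic, with hypothetical layer names and shapes for illustration:

```python
def transfer_matching_weights(target, pretrained):
    """Copy pretrained weights into `target` only where the layer name exists
    and the weight shapes agree; leave everything else at its fresh init.
    Both arguments map layer name -> list of weight-array shapes (illustrative)."""
    loaded, skipped = [], []
    for name, shapes in target.items():
        if name in pretrained and pretrained[name] == shapes:
            loaded.append(name)   # shapes match: safe to copy over
        else:
            skipped.append(name)  # new or reshaped layer: keep random init
    return loaded, skipped

# The stem conv matches, but a ResNeXt grouped-conv layer has different shapes
target = {"conv1": [(7, 7, 3, 64)], "res2a_branch2b": [(3, 3, 4, 4)]}
pretrained = {"conv1": [(7, 7, 3, 64)], "res2a_branch2b": [(3, 3, 64, 64)]}
loaded, skipped = transfer_matching_weights(target, pretrained)
print(loaded, skipped)  # ['conv1'] ['res2a_branch2b']
```

Since most ResNeXt layers differ in shape from their ResNet counterparts, very little transfers, which is why training from scratch is the practical answer here.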
@ericj974: What is the performance of your ResNeXt? How do you train from scratch? I tried the command python3 coco.py train --dataset=/path/to/coco/ (deleting --model=coco), but it failed with:

if args.model.lower() == "last":
AttributeError: 'NoneType' object has no attribute 'lower'
@ericj974 Have you tried other backbone models, like Inception-ResNet v2?
Haha, I actually did my first run with the MobileNet backbone right before @23pointsNorth mentioned it. I can confirm that it works both from scratch and from ImageNet-pretrained MobileNet weights, though I'm still trying to get the prediction quality on par with the ResNet-50 backbone.
Hi @Cpruce, I am trying to use a MobileNet backbone as well. It includes a layer called depthwise separable convolution, and Keras offers a SeparableConv2D layer. Did you just use SeparableConv2D, or add anything else when modelling the backbone?
Hi @gsujansai, I saw the SeparableConv2D layer in Keras but didn't try it, since I pulled the implementation from the Keras models. That one builds its own _depthwise_conv_block, which ends each block with the pointwise convolutions.
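For anyone unfamiliar with the pattern being discussed: a depthwise separable convolution factorizes a standard convolution into a per-channel spatial filter (depthwise) followed by a 1x1 channel-mixing convolution (pointwise), which is the structure MobileNet's blocks follow. A naive numpy sketch of the idea (shapes and names are illustrative only):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """MobileNet-style depthwise separable convolution ('valid' padding, stride 1).
    x: (H, W, C_in); dw_kernels: (k, k, C_in), one spatial filter per channel;
    pw_weights: (C_in, C_out), the 1x1 pointwise conv that mixes channels."""
    k = dw_kernels.shape[0]
    H, W, C = x.shape
    out = np.zeros((H - k + 1, W - k + 1, C))
    for c in range(C):                      # depthwise: each channel filtered alone
        for i in range(out.shape[0]):
            for j in range(out.shape[1]):
                out[i, j, c] = np.sum(x[i:i+k, j:j+k, c] * dw_kernels[:, :, c])
    return out @ pw_weights                 # pointwise: 1x1 conv across channels

x = np.random.randn(6, 6, 8)
dw = np.random.randn(3, 3, 8)
pw = np.random.randn(8, 16)
y = depthwise_separable_conv(x, dw, pw)
print(y.shape)  # (4, 4, 16)
```

The cost saving comes from replacing one k*k*C_in*C_out kernel with k*k*C_in + C_in*C_out parameters, which is why MobileNet backbones are attractive for the mobile-inference discussion below.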
@waleedka I have a memory leak but besides that my new backbone works. Shall I open a pull request?
@Cpruce Thank you
For everyone interested, check out my pull request https://github.com/matterport/Mask_RCNN/pull/306
Let me know if you have any improvements and feel free to contribute!
@Cpruce Hello, I saw your implementation of Mask R-CNN with the MobileNet-224 backbone, very interesting! Do you think it's possible to run the model on mobile devices for inference (as MobileNet usually allows)?
@chenyuZha :D yes definitely! I have (somewhat) gotten results on my mobile phone, though it is still pretty slow. You will face a few obstacles to overcome but there is much to do after getting the model to load. Let me know if you start going down this road :)
@Cpruce I actually implemented a very similar MobileNet-224 backbone; it appears to be the same as the one I saw in your repository. Did you try training it on lower-resolution images, perhaps (224, 224, 3)? If you train and run inference at lower resolution, it drastically increases the speed. However, I am still getting NaN errors in rpn_bbox_loss when trying to do so.
@JonathanCMitchell Cool! Nope, I haven't tried lower resolution yet. I've still got a few tricks I want to try first, since the average precision and recall still aren't as good. Can you show evaluation results and images with the instance segmentation?
I am still trying to get rid of NaN errors in rpn_bbox_loss when lowering the image dimensions. I have a thread here: #321
@JonathanCMitchell moving our conversation to your thread
@Cpruce When you test on your phone, do you integrate your .pb with Android Studio or Xcode? (I guess you have to convert .h5 to .pb.) I have integrated a model (CNN) trained with Inception v3 into Android Studio and it worked well, but when I tried to use my own custom .pb it no longer worked. I guess I should modify something in Android Studio, like the input or output tensor? Any ideas would be very much appreciated!
Yup! Could you post the summary of your model? You'll get a message telling you all the arguments to use. However, note that the input_layer_shape it reports may be wrong if you have multiple inputs.
@Cpruce Sorry for replying so late. In fact I tested with TF Detect and TF Classify, which finally worked (I had made a stupid error before). But for TF Detect, I could only use pre-trained models like MobileNet v1 (it sounds very tricky to modify the input or output nodes in the Java code). I really want to test something new (like DeepLab v3 + MobileNet v2), but so far I haven't found any information on implementing that. Since you work on the MobileNet Mask R-CNN, have you already tested this model in Android Studio? Maybe you have more experience with it.
@ericj974 I have a problem with the ResNeXt network. I already trained a model with ResNeXt and converted .h5 to .pb successfully (with your export_model.py file). But when I try to run inference with this .pb, I get a tf.py_func error:

UnknownError (see above for traceback): KeyError: 'pyfunc_0' [[Node: mrcnn_detection/PyFunc = PyFunc[Tin=[DT_FLOAT, DT_FLOAT, DT_FLOAT, DT_FLOAT], Tout=[DT_FLOAT], token="pyfunc_0", _device="/job:localhost/replica:0/task:0/device:CPU:0"](ROI/packed_2/_67, mrcnn_class/Reshape_1/_69, mrcnn_bbox/Reshape/_71, _arg_input_image_meta_0_1)]]

It seems that when we freeze the graph, tf.py_func causes problems with certain operations. Could you tell me if you have the same issue? How did you solve it? Thanks for your reply!
@chenyuZha Yes, I have loaded the model on my phone with Android Studio and gotten it to run inference. However, predictions are still too slow, and I need to work on how I feed/preprocess the image. For your second problem, could you try the latest master? The last instance of tf.py_func was removed in my first pull request: https://github.com/matterport/Mask_RCNN/pull/167