
Retinanet - how to perform inference

Open · rbgreenway opened this issue 6 years ago · 3 comments

I've been using TensorFlowSharp with Faster RCNN (FRCNN) successfully for a while now. However, I recently trained a RetinaNet model (using Keras/Python 3.5), verified that it works in Python, and created a frozen .pb file for use with TensorFlow. For FRCNN, there is an example in the TensorFlowSharp GitHub repo that shows how to run/fetch this model. For RetinaNet, I tried modifying that code, but nothing seems to work. I have a model summary for RetinaNet that I've tried to work from, but it's not obvious to me what should be used. The problem appears to be the parameters for the "Fetch" portion of the Runner.

For FRCNN, the graph is run in this way:

    var runner = m_session.GetRunner();

    runner
        .AddInput(m_graph["image_tensor"][0], tensor)
        .Fetch(
            m_graph["detection_boxes"][0],
            m_graph["detection_scores"][0],
            m_graph["detection_classes"][0],
            m_graph["num_detections"][0]);

    var output = runner.Run();

    var boxes = (float[,,])output[0].GetValue(jagged: false);
    var scores = (float[,])output[1].GetValue(jagged: false);
    var classes = (float[,])output[2].GetValue(jagged: false);
    var num = (float[])output[3].GetValue(jagged: false);

From the model summary for FRCNN, it is obvious what the input ("image_tensor") and outputs ("detection_boxes", "detection_scores", "detection_classes", and "num_detections") are. They are not the same for RetinaNet (I've tried), and I can't figure out what they should be. The "Fetch" part of the code above is causing a crash, and I'm guessing it's because I'm not getting the node names right.

I won't paste the entire RetinaNet summary here, but here are the first few nodes:

    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    input_1 (InputLayer)            (None, None, None, 3 0                                            
    __________________________________________________________________________________________________
    padding_conv1 (ZeroPadding2D)   (None, None, None, 3 0           input_1[0][0]                    
    __________________________________________________________________________________________________
    conv1 (Conv2D)                  (None, None, None, 6 9408        padding_conv1[0][0]              
    __________________________________________________________________________________________________
    bn_conv1 (BatchNormalization)   (None, None, None, 6 256         conv1[0][0]                      
    __________________________________________________________________________________________________
    conv1_relu (Activation)         (None, None, None, 6 0           bn_conv1[0][0]                   
    __________________________________________________________________________________________________

And here are the last several nodes:

  __________________________________________________________________________________________________
   anchors_0 (Anchors)             (None, None, 4)      0           P3[0][0]                         
   __________________________________________________________________________________________________
   anchors_1 (Anchors)             (None, None, 4)      0           P4[0][0]                         
   __________________________________________________________________________________________________
   anchors_2 (Anchors)             (None, None, 4)      0           P5[0][0]                         
   __________________________________________________________________________________________________
   anchors_3 (Anchors)             (None, None, 4)      0           P6[0][0]                         
   __________________________________________________________________________________________________
   anchors_4 (Anchors)             (None, None, 4)      0           P7[0][0]                         
   __________________________________________________________________________________________________
   regression_submodel (Model)     (None, None, 4)      2443300     P3[0][0]                         
                                                                    P4[0][0]                         
                                                                    P5[0][0]                         
                                                                    P6[0][0]                         
                                                                    P7[0][0]                         
   __________________________________________________________________________________________________
   anchors (Concatenate)           (None, None, 4)      0           anchors_0[0][0]                  
                                                                    anchors_1[0][0]                  
                                                                    anchors_2[0][0]                  
                                                                    anchors_3[0][0]                  
                                                                    anchors_4[0][0]                  
   __________________________________________________________________________________________________
   regression (Concatenate)        (None, None, 4)      0           regression_submodel[1][0]        
                                                                    regression_submodel[2][0]        
                                                                    regression_submodel[3][0]        
                                                                    regression_submodel[4][0]        
                                                                    regression_submodel[5][0]        
   __________________________________________________________________________________________________
   boxes (RegressBoxes)            (None, None, 4)      0           anchors[0][0]                    
                                                                    regression[0][0]                 
   __________________________________________________________________________________________________
   classification_submodel (Model) (None, None, 1)      2381065     P3[0][0]                         
                                                                    P4[0][0]                         
                                                                    P5[0][0]                         
                                                                    P6[0][0]                         
                                                                    P7[0][0]                         
   __________________________________________________________________________________________________
   clipped_boxes (ClipBoxes)       (None, None, 4)      0           input_1[0][0]                    
                                                                    boxes[0][0]                      
   __________________________________________________________________________________________________
   classification (Concatenate)    (None, None, 1)      0           classification_submodel[1][0]    
                                                                    classification_submodel[2][0]    
                                                                    classification_submodel[3][0]    
                                                                    classification_submodel[4][0]    
                                                                    classification_submodel[5][0]    
   __________________________________________________________________________________________________
   filtered_detections (FilterDete [(None, 300, 4), (No 0           clipped_boxes[0][0]              
                                                                    classification[0][0]             
   ==================================================================================================
   Total params: 36,382,957
   Trainable params: 36,276,717
   Non-trainable params: 106,240

Any help figuring out how to fix the "Fetch" part of this would be greatly appreciated.

EDIT:

To dig a little further into this, I found a Python function to print the operation names from a .pb file. Running it on the FRCNN .pb file clearly gives the output node names, as can be seen below (I'm only posting the last several lines of the function's output).

    import/SecondStagePostprocessor/BatchMultiClassNonMaxSuppression/map/TensorArrayStack_4/TensorArrayGatherV3
    import/SecondStagePostprocessor/ToFloat_1
    import/add/y
    import/add
    import/detection_boxes
    import/detection_scores
    import/detection_classes
    import/num_detections

If I do the same thing for the RetinaNet .pb file, it's not obvious what the outputs are. Here are the last several lines of the Python function's output.

  import/filtered_detections/map/while/NextIteration_4
   import/filtered_detections/map/while/Exit_2
   import/filtered_detections/map/while/Exit_3
   import/filtered_detections/map/while/Exit_4
   import/filtered_detections/map/TensorArrayStack/TensorArraySizeV3
   import/filtered_detections/map/TensorArrayStack/range/start
   import/filtered_detections/map/TensorArrayStack/range/delta
   import/filtered_detections/map/TensorArrayStack/range
   import/filtered_detections/map/TensorArrayStack/TensorArrayGatherV3
   import/filtered_detections/map/TensorArrayStack_1/TensorArraySizeV3
   import/filtered_detections/map/TensorArrayStack_1/range/start
   import/filtered_detections/map/TensorArrayStack_1/range/delta
   import/filtered_detections/map/TensorArrayStack_1/range
   import/filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3
   import/filtered_detections/map/TensorArrayStack_2/TensorArraySizeV3
   import/filtered_detections/map/TensorArrayStack_2/range/start
   import/filtered_detections/map/TensorArrayStack_2/range/delta
   import/filtered_detections/map/TensorArrayStack_2/range
   import/filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3

For reference, here's the Python function that I used:

    import tensorflow as tf  # TensorFlow 1.x-style API (tf.gfile, tf.GraphDef)

    def printTensors(pb_file):

        # read pb into graph_def
        with tf.gfile.GFile(pb_file, "rb") as f:
            graph_def = tf.GraphDef()
            graph_def.ParseFromString(f.read())

        # import graph_def
        with tf.Graph().as_default() as graph:
            tf.import_graph_def(graph_def)

        # print operations
        for op in graph.get_operations():
            print(op.name)


Hope this helps.

If I can get this working, I'll gladly share my code for training RetinaNet in Keras (which is actually transfer learning on my custom objects) and for running inference with that model in TensorFlowSharp. In my Python testing, RetinaNet clearly outperforms FRCNN.

rbgreenway · Oct 23 '18

I have the same problem. Does anyone know which operations return the masks and classification results for MRCNN?

oferbentovim · Feb 13 '19

I was able to figure out the input/output layer names for the Keras RetinaNet implementation using summarize_graph, a tool that ships with the TensorFlow source. This may be more detail than you need, but this is basically the process:

  1. Get the TensorFlow source from https://github.com/tensorflow/tensorflow
  2. Install Bazel for your system (this is the build tool needed to build summarize_graph). You may need to find instructions for installing Bazel for your distro.
  3. Navigate to the root of the TensorFlow source directory, and then run in a terminal:

    ./configure
    bazel build tensorflow/tools/graph_transforms:summarize_graph

  4. The built summarize_graph binary is located at:

    <path/to/TensorflowSource>/tensorflow/bazel-bin/tensorflow/tools/graph_transforms

Example run in a terminal:

    /home/bryan/TFSource/tensorflow/bazel-bin/tensorflow/tools/graph_transforms/summarize_graph --in_graph="/home/bryan/retinanet/keras-retinanet/snapshots/test_bryan.pb"

If your .pb file is frozen and ready for inference, the output will tell you the names of the input and output layers. Here is the relevant part of the output from the command above for a network trained with keras-retinanet:

    Found 1 possible inputs: (name=input_1, type=float(1), shape=[?,?,?,3])
    No variables spotted.
    Found 3 possible outputs: (name=filtered_detections/map/TensorArrayStack/TensorArrayGatherV3, op=TensorArrayGatherV3) (name=filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3, op=TensorArrayGatherV3) (name=filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3, op=TensorArrayGatherV3)

From this, you can see that the input layer name is "input_1".

The output layer names are:

    filtered_detections/map/TensorArrayStack/TensorArrayGatherV3     <-- boxes
    filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3   <-- scores
    filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3   <-- classes

You can see the output layer names are quite complicated (I could have never guessed them).
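
For reference, here is a minimal, untested sketch of what the Runner code from my first post might look like with these RetinaNet names. It assumes the frozen graph was imported with an empty prefix (so no leading "import/"), and that m_session, m_graph, and the input tensor are set up the same way as in the FRCNN example above; check the dtype of the labels output in your own graph and adjust the last cast if needed.

    var runner = m_session.GetRunner();

    runner
        .AddInput(m_graph["input_1"][0], tensor)
        .Fetch(
            m_graph["filtered_detections/map/TensorArrayStack/TensorArrayGatherV3"][0],    // boxes
            m_graph["filtered_detections/map/TensorArrayStack_1/TensorArrayGatherV3"][0],  // scores
            m_graph["filtered_detections/map/TensorArrayStack_2/TensorArrayGatherV3"][0]); // labels/classes

    var output = runner.Run();

    // Expected shapes (may differ with your export):
    //   boxes  [1, max_detections, 4]
    //   scores [1, max_detections]
    //   labels [1, max_detections]
    var boxes  = (float[,,])output[0].GetValue(jagged: false);
    var scores = (float[,])output[1].GetValue(jagged: false);
    var labels = (int[,])output[2].GetValue(jagged: false);   // use float[,] instead if your labels output is float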

I know you're working on MRCNN and not RetinaNet, but I hope this helps.

For those interested (judging by the lack of response to my original post, this may be no one), I'll try to put together a complete post of the process for taking the trained .h5 Keras file, converting it to a .pb file, and then using this .pb file with TensorflowSharp. There are lots of little nuances that I had to figure out in order to get it to work properly, but it was worth the effort for me.
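
In the meantime, here is a rough sketch of the graph-loading step on the TensorFlowSharp side (the class name and path are placeholders, and error handling is omitted); the .h5-to-.pb conversion and the image preprocessing nuances are the parts I still need to write up.

    using System.IO;
    using TensorFlow;

    public static class FrozenGraphLoader
    {
        // Loads a frozen, inference-ready .pb and returns a session over it.
        public static TFSession Load(string pbPath, out TFGraph graph)
        {
            graph = new TFGraph();

            // Import with an empty prefix so the node names match what
            // summarize_graph prints (i.e. no leading "import/").
            graph.Import(File.ReadAllBytes(pbPath), "");

            return new TFSession(graph);
        }
    }

The m_graph and m_session used in the snippets above are just the graph and session produced by a loader like this.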

rbgreenway · Feb 13 '19

Hello @rbgreenway, I have the same problem implementing a Mask R-CNN frozen graph using TensorFlowSharp. Can you please post your approach?

chrigui94 · May 03 '20