
YOLOv3 slow?

yuntai opened this issue 5 years ago · 8 comments

With video_demo.py I get about 20% of the speed of your 1.0 repo. But thanks so much for sharing!

yuntai · Jul 12 '19 14:07

Please install tensorflow-gpu!!!

YunYang1994 · Jul 12 '19 14:07

Maybe it would be faster if you used a frozen graph (".pb"), but I'm not sure about that. I will keep updating this repo; feel free to watch it!

YunYang1994 · Jul 13 '19 01:07

In utils.load_weights() I got a ValueError: No such layer: batch_normalization_v2 with 2.0.0-beta1; without the _v2 suffix it works fine.
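(For context, a minimal sketch of where this ValueError comes from, not the repo's exact code: Keras auto-names layers, and the BatchNormalization auto-name changed between TF 2.0 betas, so a hard-coded name lookup breaks. Listing the actual layer names, or matching on a prefix, is robust to the suffix.)

```python
# Hypothetical sketch: looking a layer up by its auto-generated name.
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, input_shape=(8,)),
    tf.keras.layers.BatchNormalization(),
])

# model.get_layer("batch_normalization_v2") raises ValueError on builds
# where the auto-name lacks the _v2 suffix (and vice versa). Matching on
# the prefix works on either build:
names = [layer.name for layer in model.layers]
bn_name = next(n for n in names if n.startswith("batch_normalization"))
bn_layer = model.get_layer(bn_name)
```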

yuntai · Jul 13 '19 17:07

Thank you. I fixed it just now.

YunYang1994 · Jul 14 '19 10:07

pred_bbox = model.predict(image_data) is much faster, though still not as fast as your TF1 repo.

model(x) vs. model.predict(x): when calling model(x) directly, we are executing the graph in eager mode. For model.predict, TF actually compiles the graph on the first run and then executes it in graph mode. So if you are only running the model once, model(x) is faster since no compilation is needed. Otherwise, model.predict or an exported SavedModel graph is much faster (by ~2x).

from this
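(A minimal sketch of the two call styles on a toy model, not the repo's code: both return the same values; the difference is purely in execution mode and first-call compilation cost.)

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; shapes are arbitrary.
model = tf.keras.Sequential([tf.keras.layers.Dense(4, input_shape=(8,))])
x = np.random.rand(1, 8).astype(np.float32)

eager_out = model(x)           # direct call: eager execution, no compilation
graph_out = model.predict(x)   # compiles a graph on the first call, then reuses it

# Same numbers either way; only the execution path differs.
assert np.allclose(eager_out.numpy(), graph_out, atol=1e-5)
```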

yuntai · Jul 20 '19 22:07

Thanks a lot for your valuable information.

YunYang1994 · Jul 21 '19 13:07

This gives a bit of a speed-up: very roughly ~20 fps to ~30 fps on a 1080 Ti.

feature_maps = YOLOv3(input_layer)

# Wrapping the decode loop in tf.function traces it into a single graph,
# avoiding per-frame eager overhead. YOLOv3() and decode() are from this repo.
@tf.function
def build(feature_maps):
    bbox_tensors = []
    for i, fm in enumerate(feature_maps):
        bbox_tensor = decode(fm, i)
        bbox_tensors.append(tf.reshape(bbox_tensor, (-1, 5 + num_classes)))
    return tf.concat(bbox_tensors, axis=0)

bbox_tensors = build(feature_maps)
model = tf.keras.Model(input_layer, bbox_tensors)

I think I'll come back to this speed issue when (non-beta) v2.0 is released.

BTW, I found a small optimization in postprocess_boxes(): filtering with score_mask first significantly reduces the number of rows to be processed in the following steps. Perhaps a couple of fps gain! :)
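(A sketch of the idea with NumPy and made-up shapes, not the repo's exact postprocess_boxes() code: apply the cheap score mask before the per-box work, so coordinate transforms, clipping, and NMS only touch the rows that survive.)

```python
import numpy as np

score_threshold = 0.3
# Fake YOLOv3-style predictions: one row per box, (x, y, w, h, conf, 80 classes).
pred_bbox = np.random.rand(10647, 85).astype(np.float32)

# Cheap per-row score: objectness confidence times best class probability.
scores = pred_bbox[:, 4] * pred_bbox[:, 5:].max(axis=-1)
score_mask = scores > score_threshold

# Filter early; all subsequent per-box processing runs on far fewer rows.
pred_bbox = pred_bbox[score_mask]
```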

yuntai · Jul 29 '19 15:07

For some reason predict_on_batch(image) is much faster (almost 2x)! I tried predict(image, batch_size=1) but it's still slow. With this and the tf.function above, I think the speed is now on par with your TF1 repo. Congrats & thanks!

I had a problem with tf.function on the official tf2.0-beta-gpu build, but with my own custom 2.0 build (I don't know which source commit I used) it works fine. I think it's correct usage and should be okay when the release version comes out.
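(For reference, a toy sketch of the predict_on_batch swap, not the repo's code: predict_on_batch makes a single graph call on the given array, skipping predict()'s batch-splitting machinery, which matters when each call is one video frame.)

```python
import numpy as np
import tensorflow as tf

# Toy stand-in model; the point is the call style, not the architecture.
model = tf.keras.Sequential([tf.keras.layers.Dense(2, input_shape=(4,))])
frame = np.random.rand(1, 4).astype(np.float32)

out_fast = model.predict_on_batch(frame)          # single call on the whole batch
out_slow = model.predict(frame, batch_size=1)     # splits into batches, adds overhead

assert np.allclose(out_fast, out_slow, atol=1e-5)
```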

yuntai · Aug 03 '19 16:08