
infer() failed to predict any bbox after the first call

Open wangyi1177 opened this issue 4 years ago • 24 comments

The 'serving_default' signature's infer() always returns pred_bbox with shape (1, 0, 84), except on the first call, which returns pred_bbox with shape (1, 8, 84). It happens in both detectvideo.py and evaluate.py. The TensorFlow version is 2.3.0 as required, with CUDA 10.1.

wangyi1177 avatar Dec 18 '20 09:12 wangyi1177
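For reference, the failing call pattern can be reproduced against any SavedModel signature. A minimal sketch with a toy tf.Module standing in for the yolov4-416 checkpoint (path, shapes, and the model itself are illustrative only; a healthy model returns the same shape on every call):

```python
import tensorflow as tf

# Toy stand-in for the YOLOv4 SavedModel; only the call pattern matters.
class ToyModel(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([1, 8], tf.float32)])
    def __call__(self, x):
        return {"pred_bbox": tf.zeros([1, 8, 84])}

toy = ToyModel()
tf.saved_model.save(toy, "/tmp/toy_yolo",
                    signatures=toy.__call__.get_concrete_function())

saved_model_loaded = tf.saved_model.load("/tmp/toy_yolo")
infer = saved_model_loaded.signatures["serving_default"]

batch_data = tf.zeros([1, 8], tf.float32)
out1 = infer(batch_data)  # first call
out2 = infer(batch_data)  # in the bug report, calls after the first return (1, 0, 84)
```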

Facing the exact same issue. Did you find any solution for this? @wangyi1177

cis-apoorv avatar Dec 19 '20 07:12 cis-apoorv

infer() works correctly when running TensorFlow on CPU. I'm pretty sure something goes wrong when creating the input tensor on GPU: batch = tf.constant(images_data), but I'm not sure why.

wangyi1177 avatar Dec 19 '20 11:12 wangyi1177

I had observed the same thing @wangyi1177.

cis-apoorv avatar Dec 19 '20 13:12 cis-apoorv

Adding os.environ["CUDA_VISIBLE_DEVICES"] = '0' (or any other GPU id) before creating InteractiveSession() can solve the problem. Still not sure why. It's weird that even after setting os.environ["CUDA_VISIBLE_DEVICES"] = '', the session still uses the GPU, not the CPU as expected.

wangyi1177 avatar Dec 20 '20 10:12 wangyi1177
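A likely explanation (an assumption on my part, but consistent with the behavior above) is that CUDA_VISIBLE_DEVICES is only read when TensorFlow first initializes the CUDA runtime, so it must be set before any TensorFlow import or session creation. A minimal sketch of the top of detectvideo.py:

```python
import os

# Must run before `import tensorflow` / InteractiveSession(); once the
# CUDA runtime is initialized, changing this variable has no effect,
# which would also explain why setting it to '' later still used the GPU.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# ...the script's existing TensorFlow imports and
# InteractiveSession() call follow below this point.
```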

Hello @wangyi1177! I got it running by manually saving the weights. You can edit save_model.py and just add model.save_weights(FLAGS.weights) after utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny). For some reason, model.save() is unable to save the weights.

cis-apoorv avatar Dec 22 '20 17:12 cis-apoorv
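The suggestion above boils down to a save_weights()/load_weights() round trip. A sketch with a toy Keras model standing in for the YOLOv4 graph (the model and file path are illustrative, not the repo's real ones):

```python
import numpy as np
import tensorflow as tf

# Toy stand-in for the model built in save_model.py.
model = tf.keras.Sequential([tf.keras.Input(shape=(8,)),
                             tf.keras.layers.Dense(4)])

# After utils.load_weights(...) in save_model.py, add:
model.save_weights("/tmp/toy.weights.h5")

# The weights survive the round trip even when model.save() does not
# capture them correctly:
clone = tf.keras.Sequential([tf.keras.Input(shape=(8,)),
                             tf.keras.layers.Dense(4)])
clone.load_weights("/tmp/toy.weights.h5")
```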

Same here. It works fine when using the COCO models, but it doesn't work with custom models trained directly in Darknet. Have you figured out any way to solve this?

yieniggu avatar Jan 24 '21 01:01 yieniggu

I'm running with GPU enabled on Google Colab and facing the same problem. Any fixes? @wangyi1177, @cis-apoorv

nishantr05 avatar Jan 30 '21 16:01 nishantr05

Facing the same problem. Any fixes? @wangyi1177

kevinhey avatar Feb 03 '21 15:02 kevinhey

Which file should I add os.environ["CUDA_VISIBLE_DEVICES"] = '0' to? I have added it in save_model.py and detect.py, but it didn't work. @wangyi1177

kevinhey avatar Feb 03 '21 15:02 kevinhey

Hello @yieniggu, @nishantr05, and @kevinhey. Please replace utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny) in save_model.py with:

model.save_weights(FLAGS.weights)
model.load_weights(FLAGS.weights)

cis-apoorv avatar Feb 04 '21 05:02 cis-apoorv

@cis-apoorv thanks for your reply. My doubt is: if you entirely remove utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny),

wouldn't it prohibit us from working with the tiny version?

Also, could you please share your version of save_model.py?

yieniggu avatar Feb 04 '21 05:02 yieniggu

Hi @yieniggu

You are right, it will prohibit you from using the tiny version. So instead, what you can do is first save and load the model using:

model.save_weights(FLAGS.weights)
model.load_weights(FLAGS.weights)

and append utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny) below model.load_weights().

This worked for my tiny-yolov4 model.

cis-apoorv avatar Feb 04 '21 09:02 cis-apoorv

Hi all, neither of the solutions provided here worked for me. Detection only occurs on the first frame of a video when running on GPU; there's no problem on CPU. Has anyone solved this issue? Thanks in advance.

sulebaynes avatar Mar 16 '21 10:03 sulebaynes

I have the same issue: infer in detectvideo.py doesn't work after the first call. I hope someone can solve this problem.

yildbs avatar Mar 21 '21 01:03 yildbs

@cis-apoorv I don't fully understand your explanation. Do you mean that in save_model.py the utils.load_weights call should be replaced with model.save_weights / model.load_weights? Or that model.save_weights / model.load_weights should be added, with utils.load_weights appended below them? I want to use the full TensorFlow YOLOv4 model, not the tiny version.

SKH93 avatar Apr 06 '21 00:04 SKH93

Hello @SKH93. What I meant is: in the save_model.py file, add:

model.save_weights(FLAGS.weights)
model.load_weights(FLAGS.weights)

above utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny). This works for both yolov4 and tiny-yolov4.

cis-apoorv avatar Apr 06 '21 06:04 cis-apoorv
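Putting the thread's suggestion together, the edited block of save_model.py would look roughly like this (a sketch: a toy model stands in for the YOLOv4 graph, and the repo-specific calls are left as comments since FLAGS and utils come from the repo):

```python
import tensorflow as tf

# Toy stand-in for the YOLOv4 graph built earlier in save_model.py.
model = tf.keras.Sequential([tf.keras.Input(shape=(8,)),
                             tf.keras.layers.Dense(4)])

model.save_weights("/tmp/toy_ordering.weights.h5")  # 1. save freshly built weights
model.load_weights("/tmp/toy_ordering.weights.h5")  # 2. load them back
# 3. then the repo's original darknet-weight loader, unchanged:
# utils.load_weights(model, FLAGS.weights, FLAGS.model, FLAGS.tiny)
# 4. finally export as before:
# model.save(FLAGS.output)
```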

I think I figured it out: tf.keras.Model.save seems not to be compatible with tf.saved_model.load. Use tf.keras.models.load_model with tf.keras.Model.save, and tf.saved_model.load with tf.saved_model.save.

toddwong avatar May 06 '21 06:05 toddwong
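The pairings @toddwong describes, sketched with toy models (names and paths are illustrative; note that tf.saved_model.save needs an explicit signature for a bare tf.Module):

```python
import tensorflow as tf

# Pairing 1: tf.keras.Model.save  <->  tf.keras.models.load_model
model = tf.keras.Sequential([tf.keras.Input(shape=(3,)),
                             tf.keras.layers.Dense(2)])
model.save("/tmp/toy_pair.h5")                           # Keras saver...
restored = tf.keras.models.load_model("/tmp/toy_pair.h5")  # ...Keras loader

# Pairing 2: tf.saved_model.save  <->  tf.saved_model.load
class M(tf.Module):
    @tf.function(input_signature=[tf.TensorSpec([1, 3], tf.float32)])
    def __call__(self, x):
        return {"y": 2.0 * x}

m = M()
tf.saved_model.save(m, "/tmp/toy_pair_sm",
                    signatures=m.__call__.get_concrete_function())
loaded = tf.saved_model.load("/tmp/toy_pair_sm")
infer = loaded.signatures["serving_default"]
```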

@toddwong It takes really long for me, and I get warning messages like:

WARNING:tensorflow:No training configuration found in save file, so the model was *not* compiled. Compile it manually.
W0630 14:49:52.461997 24320 load.py:171] No training configuration found in save file, so the model was *not* compiled. Compile it manually.

phykurox avatar Jun 30 '21 07:06 phykurox

The above fix slowed prediction on GPU by a lot.

I think I found a way to solve it. I was using a TensorFlow 2.4.2 Docker image; changing to the TensorFlow version specified in requirements-gpu.txt (2.3.0rc0-gpu) solved it: it ran on GPU, got predictions for the entire video, and was as fast as usual. A model converted with 2.4.2 didn't work even when predicting with 2.3.0rc0; I had to re-convert the model using TF 2.3.0rc0 and also predict with 2.3.0rc0.

Pedrohgv avatar Aug 07 '21 11:08 Pedrohgv
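One way to guard against this mismatch is to fail fast when the running TensorFlow's major.minor differs from the version the model was converted with. A stdlib-only sketch (the helper name is hypothetical; at runtime you would pass tf.__version__ as one of the arguments):

```python
def same_major_minor(a: str, b: str) -> bool:
    """True when two version strings share major.minor (e.g. both 2.3)."""
    return a.split(".")[:2] == b.split(".")[:2]

# Convert and predict with matching versions, per the report above:
assert same_major_minor("2.3.0rc0", "2.3.0")      # OK to mix
assert not same_major_minor("2.4.2", "2.3.0rc0")  # re-convert needed
```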


@Pedrohgv your solution works for me, thanks!

crisptof avatar Sep 14 '21 03:09 crisptof


Thanks @Pedrohgv, it works for me :D

lehieubkhn avatar Dec 17 '21 02:12 lehieubkhn

I faced the same problem with my RTX A5000. After weeks of debugging, we found that models trained on Ampere-architecture GPUs cannot be run on TensorFlow <= 2.4.0 after conversion, and by extension cannot be run with this library, which produces the issue above. Our workaround: train the model on Colab, which uses pre-Ampere GPUs that are backward compatible on CUDA, then convert it with save_model.py under TensorFlow 2.3.0. That model and the repo then worked as intended on TensorFlow 2.5.0 as well.

rishabhshetty98 avatar Jan 31 '22 06:01 rishabhshetty98


Thanks @rishabhshetty98, it works. I tested converting the model using TensorFlow 2.3.0, then using that model with TensorFlow 2.5.0.

aproxtimedev avatar Nov 13 '22 06:11 aproxtimedev

TensorFlow 2.6.0, detect.py:

pred_bbox = infer(batch_data)  # first call works
pred_bbox = infer(batch_data)  # second call does not work --> []

XavierChou707 avatar Feb 14 '23 08:02 XavierChou707