BigDL-2.x

When I use Cluster Serving to run streaming inference on a face-detection model, the result is wrong.

Open gtfaiwxm opened this issue 3 years ago • 9 comments

When I use Cluster Serving to run streaming inference on a face-detection model, the result is wrong. When I write one record and then call time.sleep(3) before writing the next, the result is right.
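The two write patterns described above can be sketched as follows. This is only an illustration of the pause-between-writes behavior, not the reporter's actual script: `write_throttled` and its `enqueue` parameter are hypothetical names, and `enqueue` stands in for whatever client call writes one record to the serving input queue.

```python
import time
from typing import Callable, Iterable, Tuple


def write_throttled(
    records: Iterable[Tuple[str, object]],
    enqueue: Callable[[str, object], None],
    delay_s: float = 3.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write records one at a time, pausing after each write.

    `enqueue` is a hypothetical stand-in for the client call that writes
    one record; it is not a real Cluster Serving API.
    """
    count = 0
    for key, payload in records:
        enqueue(key, payload)
        # Pause after every write; 3 seconds is the delay reported to
        # give correct results. Use delay_s=0 to mimic continuous writes.
        sleep(delay_s)
        count += 1
    return count
```

Passing `delay_s=0` reproduces the continuous-write case, so the same helper can drive both runs when comparing outputs.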

gtfaiwxm avatar Mar 08 '21 03:03 gtfaiwxm

Hi @gtfaiwxm, as we discussed, please paste more info about the model and the input/output. Thanks.

glorysdj avatar Mar 08 '21 07:03 glorysdj

Hi @glorysdj, the face-detection-0100 model was downloaded from the OpenVINO model zoo (2020.2 version). The net outputs a blob with shape [1, 1, N, 7], where N is the number of detected bounding boxes. For each detection, the description has the format [image_id, label, conf, x_min, y_min, x_max, y_max], where:

  • image_id - ID of the image in the batch
  • label - predicted class ID
  • conf - confidence for the predicted class
  • (x_min, y_min) - coordinates of the top left bounding box corner
  • (x_max, y_max) - coordinates of the bottom right bounding box corner

When I keep writing image data continuously, the result is:

{'image-4-test_1.jpg': array([[[14. , 1. , 0.02250869, ..., 0. , 0. , 0. ], [14. , 1. , 0.02248706, ..., 0. , 0. , 0. ], [14. , 1. , 0.022426 , ..., 0. , 0. , 0. ], ..., [15. , 1. , 0.02205388, ..., 0. , 0. , 0. ], [15. , 1. , 0.02199526, ..., 0. , 0. , 0. ], [15. , 1. , 0.02189598, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-1-test_1.jpg': array([[[0.0000000e+00, 1.0000000e+00, 9.9770606e-01, ..., 5.1740110e-03, 7.5382727e-01, 8.2223165e-01], [1.0000000e+00, 1.0000000e+00, 9.9770606e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [1.0000000e+00, 1.0000000e+00, 9.9604505e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], ..., [1.0000000e+01, 1.0000000e+00, 9.5114928e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [1.0000000e+01, 1.0000000e+00, 9.4461030e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [1.0000000e+01, 1.0000000e+00, 9.3502581e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00]]], dtype=float32),
 'image-9-test_1.jpg': array([[[20. , 1. , 0.02957392, ..., 0. , 0. , 0. ], [20. , 1. , 0.02924488, ..., 0. , 0. , 0. ], [20. , 1. , 0.02915918, ..., 0. , 0. , 0. ], ..., [21. , 1. , 0.02574502, ..., 0. , 0. , 0. ], [21. , 1. , 0.02572953, ..., 0. , 0. , 0. ], [21. , 1. , 0.02556236, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-16-test_1.jpg': array([[[13. , 1. , 0.0230323 , ..., 0. , 0. , 0. ], [13. , 1. , 0.02297049, ..., 0. , 0. , 0. ], [13. , 1. , 0.02296207, ..., 0. , 0. , 0. ], ..., [14. , 1. , 0.02266836, ..., 0. , 0. , 0. ], [14. , 1. , 0.02261219, ..., 0. , 0. , 0. ], [14. , 1. , 0.02260978, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-11-test_1.jpg': array([[[22. , 1. , 0.02645806, ..., 0. , 0. , 0. ], [22. , 1. , 0.02644061, ..., 0. , 0. , 0. ], [22. , 1. , 0.02641329, ..., 0. , 0. , 0. ], ..., [23. , 1. , 0.02858338, ..., 0. , 0. , 0. ], [23. , 1. , 0.02836005, ..., 0. , 0. , 0. ], [23. , 1. , 0.02832798, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-15-test_1.jpg': array([[[10. , 1. , 0.93154913, ..., 0. , 0. , 0. ], [10. , 1. , 0.8998288 , ..., 0. , 0. , 0. ], [10. , 1. , 0.8828549 , ..., 0. , 0. , 0. ], ..., [13. , 1. , 0.02323997, ..., 0. , 0. , 0. ], [13. , 1. , 0.02311641, ..., 0. , 0. , 0. ], [13. , 1. , 0.02307337, ..., 0. , 0. , 0. ]]], dtype=float32)}

When I write image data with time.sleep(3) between writes, the results are:

{'image-4-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-1-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-9-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-16-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-11-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-15-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
 'image-17-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32)}
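For readers comparing the two dumps, a minimal sketch of unpacking the 7-value detection rows described above may help. It assumes the output has been converted to nested Python lists (for example with NumPy's `ndarray.tolist()`); `Detection`, `parse_detections`, and the `min_conf` threshold are hypothetical names chosen for this illustration. Because the printed arrays show three nesting levels while the model description gives [1, 1, N, 7], the helper peels off any leading wrapper dimensions rather than assuming a fixed depth.

```python
from typing import List, NamedTuple, Sequence


class Detection(NamedTuple):
    image_id: int
    label: int
    conf: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def parse_detections(blob: Sequence, min_conf: float = 0.5) -> List[Detection]:
    """Turn a nested-list detection blob into Detection records.

    Keeps only rows whose confidence is at least `min_conf`.
    """
    rows = blob
    # Descend through wrapper dimensions until each item is a flat row.
    while rows and isinstance(rows[0], (list, tuple)) and rows[0] \
            and isinstance(rows[0][0], (list, tuple)):
        rows = rows[0]

    detections = []
    for row in rows:
        if len(row) != 7:
            raise ValueError(f"expected 7 values per detection, got {len(row)}")
        image_id, label, conf, x_min, y_min, x_max, y_max = row
        if conf >= min_conf:
            detections.append(Detection(
                int(image_id), int(label), float(conf),
                float(x_min), float(y_min), float(x_max), float(y_max),
            ))
    return detections
```

Grouping the parsed rows by `image_id` makes it easy to see that the continuous-write results carry IDs such as 14, 20, or 22, while the results written with a pause start at 0.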

gtfaiwxm avatar Mar 09 '21 02:03 gtfaiwxm

We have reproduced this error and are now trying to fix it.

glorysdj avatar Mar 10 '21 02:03 glorysdj

Hi @glorysdj, has this problem been resolved?

gtfaiwxm avatar Mar 17 '21 01:03 gtfaiwxm

Hi @gtfaiwxm, we are testing the fix now. The fix will be merged soon. Thanks.

glorysdj avatar Mar 18 '21 07:03 glorysdj

Hi @gtfaiwxm, our tests show that this model does not predict correctly with batching in Cluster Serving, while other models do. We have therefore added an entry in config.yaml that lets users disable batch inference, which resolves the problem.
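One way to check whether a given setting avoids the problem is to submit the same image under several keys and confirm that every result matches. The sketch below assumes exactly that setup (the dumps earlier in the thread all use test_1.jpg) and assumes the results have been converted to nested Python lists; `inconsistent_keys` and its tolerance parameter are hypothetical names, not part of Cluster Serving.

```python
import math
from typing import Dict, List, Sequence


def _flatten(values) -> List[float]:
    """Recursively flatten nested lists or tuples into a list of floats."""
    if isinstance(values, (list, tuple)):
        out: List[float] = []
        for v in values:
            out.extend(_flatten(v))
        return out
    return [float(values)]


def inconsistent_keys(results: Dict[str, Sequence], rel_tol: float = 1e-5) -> List[str]:
    """Return the keys whose output differs from the first result.

    Assumes every key was produced from the same input image, so all
    outputs are expected to agree up to floating-point noise.
    """
    items = list(results.items())
    if not items:
        return []
    reference = _flatten(items[0][1])
    bad = []
    for key, value in items[1:]:
        flat = _flatten(value)
        same = len(flat) == len(reference) and all(
            math.isclose(a, b, rel_tol=rel_tol, abs_tol=1e-8)
            for a, b in zip(flat, reference)
        )
        if not same:
            bad.append(key)
    return bad
```

An empty return value means all outputs agree with the first one; it does not by itself prove that the first one is correct, so it is worth comparing against a result obtained from a single, isolated request.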

KimiLiuQh avatar Mar 19 '21 14:03 KimiLiuQh

@gtfaiwxm We are currently fixing this issue. It may well be located in the Analytics Zoo core, and confirming that may take some time.

As a workaround, you could set core_num: 1 in the config file. This will reduce performance somewhat, but the result will be right.

Litchilitchy avatar Mar 23 '21 06:03 Litchilitchy

This bug is fixed in #3690. You could try again with the nightly version.

Litchilitchy avatar Mar 31 '21 02:03 Litchilitchy

@gtfaiwxm Any more questions on this issue? If not, we may close it soon.

helenlly avatar Nov 25 '21 06:11 helenlly