BigDL-2.x: When I use Cluster Serving to run inference on a face-detection model by streaming, the result is wrong.

When I use Cluster Serving to run inference on a face-detection model by streaming, the results are wrong. When I write one input and then call time.sleep(3) before writing the next, the results are correct.

Mar 08 '21 03:03 gtfaiwxm

Hi @gtfaiwxm, as we discussed, please paste more info about the model and its input/output. Thanks.

Mar 08 '21 07:03 glorysdj

Hi @glorysdj, the face-detection-0100 model was downloaded from the OpenVINO model zoo (2020.2 version). The net outputs a blob with shape [1, 1, N, 7], where N is the number of detected bounding boxes. Each detection has the format [image_id, label, conf, x_min, y_min, x_max, y_max], where:

image_id - ID of the image in the batch
label - predicted class ID
conf - confidence for the predicted class
(x_min, y_min) - coordinates of the top left bounding box corner
(x_max, y_max) - coordinates of the bottom right bounding box corner

When I keep writing image data without pausing, the results are:

{'image-4-test_1.jpg': array([[[14. , 1. , 0.02250869, ..., 0. , 0. , 0. ], [14. , 1. , 0.02248706, ..., 0. , 0. , 0. ], [14. , 1. , 0.022426 , ..., 0. , 0. , 0. ], ..., [15. , 1. , 0.02205388, ..., 0. , 0. , 0. ], [15. , 1. , 0.02199526, ..., 0. , 0. , 0. ], [15. , 1. , 0.02189598, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-1-test_1.jpg': array([[[0.0000000e+00, 1.0000000e+00, 9.9770606e-01, ..., 5.1740110e-03, 7.5382727e-01, 8.2223165e-01], [1.0000000e+00, 1.0000000e+00, 9.9770606e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [1.0000000e+00, 1.0000000e+00, 9.9604505e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], ..., [1.0000000e+01, 1.0000000e+00, 9.5114928e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [1.0000000e+01, 1.0000000e+00, 9.4461030e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00], [1.0000000e+01, 1.0000000e+00, 9.3502581e-01, ..., 0.0000000e+00, 0.0000000e+00, 0.0000000e+00]]], dtype=float32),
'image-9-test_1.jpg': array([[[20. , 1. , 0.02957392, ..., 0. , 0. , 0. ], [20. , 1. , 0.02924488, ..., 0. , 0. , 0. ], [20. , 1. , 0.02915918, ..., 0. , 0. , 0. ], ..., [21. , 1. , 0.02574502, ..., 0. , 0. , 0. ], [21. , 1. , 0.02572953, ..., 0. , 0. , 0. ], [21. , 1. , 0.02556236, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-16-test_1.jpg': array([[[13. , 1. , 0.0230323 , ..., 0. , 0. , 0. ], [13. , 1. , 0.02297049, ..., 0. , 0. , 0. ], [13. , 1. , 0.02296207, ..., 0. , 0. , 0. ], ..., [14. , 1. , 0.02266836, ..., 0. , 0. , 0. ], [14. , 1. , 0.02261219, ..., 0. , 0. , 0. ], [14. , 1. , 0.02260978, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-11-test_1.jpg': array([[[22. , 1. , 0.02645806, ..., 0. , 0. , 0. ], [22. , 1. , 0.02644061, ..., 0. , 0. , 0. ], [22. , 1. , 0.02641329, ..., 0. , 0. , 0. ], ..., [23. , 1. , 0.02858338, ..., 0. , 0. , 0. ], [23. , 1. , 0.02836005, ..., 0. , 0. , 0. ], [23. , 1. , 0.02832798, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-15-test_1.jpg': array([[[10. , 1. , 0.93154913, ..., 0. , 0. , 0. ], [10. , 1. , 0.8998288 , ..., 0. , 0. , 0. ], [10. , 1. , 0.8828549 , ..., 0. , 0. , 0. ], ..., [13. , 1. , 0.02323997, ..., 0. , 0. , 0. ], [13. , 1. , 0.02311641, ..., 0. , 0. , 0. ], [13. , 1. , 0.02307337, ..., 0. , 0. , 0. ]]], dtype=float32)}

When I write each image and call time.sleep(3) before the next one, the results are:

{'image-4-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-1-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-9-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-16-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-11-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-15-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32),
'image-17-test_1.jpg': array([[[0. , 1. , 0.99770606, ..., 0.00517401, 0.7538273 , 0.82223165], [1. , 1. , 0.12527953, ..., 0. , 0. , 0. ], [1. , 1. , 0.11860519, ..., 0. , 0. , 0. ], ..., [2. , 1. , 0.10363458, ..., 0. , 0. , 0. ], [2. , 1. , 0.0918026 , ..., 0. , 0. , 0. ], [2. , 1. , 0.08697815, ..., 0. , 0. , 0. ]]], dtype=float32)}
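
For anyone comparing the two runs, here is a rough sketch of how the detection format described above could be unpacked and how the per-request results could be checked against each other. The helper names `parse_detections` and `results_agree`, the 0.5 threshold, and the demo values are assumptions for illustration, not part of Cluster Serving. The consistency check only makes sense if every request carried the same image, as the key names suggest.

```python
import numpy as np


def parse_detections(blob, conf_threshold=0.5):
    """Unpack a detection blob into dicts, keeping rows with conf >= threshold.

    Accepts [1, 1, N, 7] (the model's documented shape) or [1, N, 7]
    (the shape printed in the results), since both flatten to rows of 7.
    """
    rows = np.asarray(blob, dtype=np.float32).reshape(-1, 7)
    detections = []
    for image_id, label, conf, x_min, y_min, x_max, y_max in rows:
        if conf < conf_threshold:
            continue  # skip low-confidence rows
        detections.append({
            "image_id": int(image_id),
            "label": int(label),
            "conf": float(conf),
            "box": (float(x_min), float(y_min), float(x_max), float(y_max)),
        })
    return detections


def results_agree(results, atol=1e-5):
    """Return True when every array in the result dict matches the first one."""
    arrays = [np.asarray(a) for a in results.values()]
    first = arrays[0]
    return all(
        a.shape == first.shape and np.allclose(a, first, atol=atol)
        for a in arrays[1:]
    )


if __name__ == "__main__":
    # Made-up demo values, not taken from the outputs above.
    demo = np.array(
        [[[[0, 1, 0.9, 0.1, 0.2, 0.3, 0.4],
           [0, 1, 0.02, 0.0, 0.0, 0.0, 0.0]]]],
        dtype=np.float32,
    )
    print(parse_detections(demo))
    print(results_agree({"a": demo, "b": demo.copy()}))
```

If `results_agree` returns False for a run where the same image was written under every key, the outputs differ between requests, which matches the streaming behaviour shown above.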

Mar 09 '21 02:03 gtfaiwxm

We have reproduced this error and are now trying to fix it.

Mar 10 '21 02:03 glorysdj

Hi @glorysdj, has this problem been resolved?

Mar 17 '21 01:03 gtfaiwxm

Hi @gtfaiwxm, we are testing the fix now. The fix will be merged soon. Thanks.

Mar 18 '21 07:03 glorysdj

Hi @gtfaiwxm, our tests show that this model does not predict correctly with batching in Cluster Serving, while other models do, so we added an entry to config.yaml that lets users disable batch inference to work around the problem.

Mar 19 '21 14:03 KimiLiuQh

@gtfaiwxm We are currently fixing this issue. It may well be located in Analytics Zoo core, and it may take some time to confirm.

As a workaround, you can set core_num: 1 in the config file. This costs some performance, but the results will be correct.
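
If you would rather script that change than edit the file by hand, something like the sketch below might do. It only rewrites an existing `core_num:` line in the config text and raises an error if there is none, because where the key lives in the file is not stated here. The helper name `set_core_num` and the sample layout are hypothetical.

```python
import re


def set_core_num(config_text, value=1):
    """Return config text with the existing `core_num:` entry set to `value`.

    Raises ValueError when no `core_num:` line exists, rather than guessing
    where the key belongs in the file.
    """
    pattern = re.compile(r"^([ \t]*core_num[ \t]*:).*$", flags=re.MULTILINE)
    if not pattern.search(config_text):
        raise ValueError("no core_num entry found in config text")
    # Keep the original indentation and key, replace only the value.
    return pattern.sub(lambda m: f"{m.group(1)} {value}", config_text)


if __name__ == "__main__":
    # Hypothetical layout, for illustration only; your config file may differ.
    sample = "model:\n  path: /tmp/model\nparams:\n  core_num: 4\n"
    print(set_core_num(sample, 1))
```

To apply it, read your config file into a string, pass it through `set_core_num`, write it back, and restart serving.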

Mar 23 '21 06:03 Litchilitchy

This bug is fixed in #3690; you can try again with the nightly version.

Mar 31 '21 02:03 Litchilitchy

@gtfaiwxm Any more questions on this issue? If not, we may close it soon.

Nov 25 '21 06:11 helenlly
