caffe icon indicating copy to clipboard operation
caffe copied to clipboard

Noise in NMS threshold filtering (GetMaxScoreIndex)

Open MinhazPalasara opened this issue 6 years ago • 11 comments

Threshold driven filtering in a fixed-size array leads to noisy (default) pair values at the end of a vector which results in low confidence noisy output.

MinhazPalasara avatar Jan 30 '19 00:01 MinhazPalasara

This is definitely a good bug fix to merge. @hshen14 is there any reason it's not merged?

matt-ny avatar Mar 11 '19 19:03 matt-ny

thanks for the reminder, we will merge this update to next release.

ftian1 avatar Mar 13 '19 01:03 ftian1

@MinhazPalasara @matt-ny I just reviewed the fix. I have a question about it: the default pair values in the vector are all 0. why using .at() has problem?

for example: org: 0, 0.1, 0, 0.2, 0 chg: 0.1, 0.2

after sorting, they should all get same output. do I miss something?

ftian1 avatar Mar 13 '19 03:03 ftian1

Thanks for looking into this! We appreciate it.

Let's say there are j valid detections with a score > threshold at the start of GetMaxScoreIndex, and j < top_k

Then the sorted score_index_vec, when returning at line 2278, https://github.com/intel/caffe/pull/261/files#diff-f52cd447b1a3377020239892ba7616b3R2278 will have the j valid detections sorted at the start, followed by top_k - j default-initialized pairs of score:0.f, index:0 at the end of the vector.

Then, at line 2444, after the first j detections have been processed, the loop will process the 0,0 pairs: idx becomes 0, and if the box of index 0 does not have above-threshold JaccardOverlap with another result, it gets pushed into the output array indices at line 2456. This results typically in a spurious detection in the upper left of the image with a score that's below the threshold being reported.

Another way to fix is to count the number of above-threshold detections trueDetectCount and set top_k to the min(top_k, trueDetectCount) at line 2294, however, the proposed PR is stylistically consistent with the templated implementation of GetMaxScoreIndex at line 2286, and looking in the history and Wei Liu's original code, line 2266 used to use push_back but was changed, so this PR is really just backing out this issue.

matt-ny avatar Mar 14 '19 03:03 matt-ny

Thanks for your elaboration, the idx 0 is used wrongly with original code. I will merge your fix to our repo.

ftian1 avatar Mar 14 '19 05:03 ftian1

@matt-ny this fix would make unit test fail. you can reproduce it by "cmake .. && make test.testbin && ./test/test.testbin --gtest_filter=*NMS*".

ftian1 avatar Mar 15 '19 07:03 ftian1

Thanks for your reply. I ran the tests as you described, but they did not fail (See attached log). NMStest.log

Is there anything else I need to do besides cmake .. && make test.testbin && ./test/test.testbin --gtest_filter=*NMS*

matt-ny avatar Mar 15 '19 15:03 matt-ny

I think I have known what happened after debugging. we combined vector push_back() with openmp parallel for primitive without protect, it will bring issue at line 2266.(or wrong value or free an unallocate buffer)

we should add openmp critical primitive on push_back().

ftian1 avatar Mar 15 '19 16:03 ftian1

I'm afraid I'm not familiar with openmp and what you're describing here; is the fix to protect push_back() something you will do, or can you make a suggestion how?

matt-ny avatar Mar 15 '19 17:03 matt-ny

What needs to be done to move forward @ftian1 ? thanks very much!

matt-ny avatar Apr 02 '19 19:04 matt-ny

I have made corresponding changes in local to pass test. your PR will be merged to next release. Thanks

ftian1 avatar Apr 03 '19 01:04 ftian1