caffe
caffe copied to clipboard
Noise in NMS threshold filtering (GetMaxScoreIndex)
Threshold driven filtering in a fixed-size array leads to noisy (default) pair values at the end of a vector which results in low confidence noisy output.
This is definitely a good bug fix to merge. @hshen14 is there any reason it's not merged?
thanks for the reminder, we will merge this update to next release.
@MinhazPalasara @matt-ny I just reviewed the fix. I have a question about it: the default pair values in the vector are all 0. why using .at() has problem?
for example: org: 0, 0.1, 0, 0.2, 0 chg: 0.1, 0.2
after sorting, they should all get same output. do I miss something?
Thanks for looking into this! We appreciate it.
Let's say there are j
valid detections with a score > threshold
at the start of GetMaxScoreIndex
, and j < top_k
Then the sorted score_index_vec
, when returning at line 2278, https://github.com/intel/caffe/pull/261/files#diff-f52cd447b1a3377020239892ba7616b3R2278 will have the j
valid detections sorted at the start, followed by top_k - j
default-initialized pairs of score:0.f, index:0 at the end of the vector.
Then, at line 2444, after the first j
detections have been processed, the loop will process the 0,0 pairs: idx
becomes 0, and if the box of index 0 does not have above-threshold JaccardOverlap with another result, it gets pushed into the output array indices
at line 2456. This results typically in a spurious detection in the upper left of the image with a score that's below the threshold being reported.
Another way to fix is to count the number of above-threshold detections trueDetectCount
and set top_k to the min(top_k, trueDetectCount) at line 2294, however, the proposed PR is stylistically consistent with the templated implementation of GetMaxScoreIndex
at line 2286, and looking in the history and Wei Liu's original code, line 2266 used to use push_back but was changed, so this PR is really just backing out this issue.
Thanks for your elaboration, the idx 0 is used wrongly with original code. I will merge your fix to our repo.
@matt-ny this fix would make unit test fail. you can reproduce it by "cmake .. && make test.testbin && ./test/test.testbin --gtest_filter=*NMS*".
Thanks for your reply. I ran the tests as you described, but they did not fail (See attached log). NMStest.log
Is there anything else I need to do besides cmake .. && make test.testbin && ./test/test.testbin --gtest_filter=*NMS*
I think I have known what happened after debugging. we combined vector push_back() with openmp parallel for primitive without protect, it will bring issue at line 2266.(or wrong value or free an unallocate buffer)
we should add openmp critical primitive on push_back().
I'm afraid I'm not familiar with openmp and what you're describing here; is the fix to protect push_back() something you will do, or can you make a suggestion how?
What needs to be done to move forward @ftian1 ? thanks very much!
I have made corresponding changes in local to pass test. your PR will be merged to next release. Thanks