
Explanations concerning multiple detections

Open nicolaschianella opened this issue 4 years ago • 5 comments

Hello @Cartucho and thanks for your amazing work.

I was wondering how you deal with multiple detections for one ground truth, when the GT and the detections share the same label. Which one is considered a TP: the one with the highest IoU or the one with the highest confidence score? I have found both ways of making this decision; it seems it does not change the mAP much in practice, but it could in some cases, right?

Have a nice day

nicolaschianella · Apr 03 '20 08:04

I am also interested in this.

Thanks!

komer94 · Apr 06 '20 21:04

From reading the source code: the detection with the highest confidence score is the one matched to the ground truth.
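
For concreteness, here is a toy sketch (names and numbers are made up, not taken from the repo) of a case where the two rules disagree: with confidence-first matching, as the code walkthrough below shows, the higher-confidence detection becomes the TP even if a lower-confidence detection overlaps the GT better.

# Toy case: one ground-truth box, two same-class detections that both overlap it.
gt_used = False
detections = [
    {"name": "A", "confidence": 0.9, "iou": 0.6},  # higher confidence, lower IoU
    {"name": "B", "confidence": 0.7, "iou": 0.8},  # lower confidence, higher IoU
]
# Confidence-first rule: go through the detections by decreasing confidence;
# the first one with IoU >= 0.5 against a not-yet-used GT is the TP,
# every later overlap with that same GT is an FP.
for det in sorted(detections, key=lambda d: d["confidence"], reverse=True):
    if det["iou"] >= 0.5 and not gt_used:
        gt_used = True
        print(det["name"], "-> TP")
    else:
        print(det["name"], "-> FP")
# Prints: A -> TP, B -> FP. An IoU-first rule would instead pick B as the TP,
# so precision/recall (and therefore AP) can differ slightly between the two rules.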

zjZSTU · Apr 21 '20 02:04

Hi! Can you show the code supporting what you said? Thanks

pauloamed · May 12 '20 15:05

Let me briefly walk through the implementation process:

  1. A benchmark IoU threshold is defined to decide whether a bounding box counts as a positive:
MINOVERLAP = 0.5 # default value (defined in the PASCAL VOC2012 challenge)
  2. The detection results of all images are traversed, stored separately by class, and sorted by confidence score (from high to low):
############# 465 - 495
for class_index, class_name in enumerate(gt_classes):
    # ...
    for txt_file in dr_files_list:
        # ...
        lines = file_lines_to_list(txt_file)
        # ...
    # sort detection-results by decreasing confidence
    bounding_boxes.sort(key=lambda x:float(x['confidence']), reverse=True)
    with open(TEMP_FILES_PATH + "/" + class_name + "_dr.json", 'w') as outfile:
        json.dump(bounding_boxes, outfile)
  3. For each class, traverse its detection results, open the ground-truth file with the matching file_id, and find the ground-truth object with the largest IoU (a standalone sketch of this IoU formula follows the list):
################ 504 - 568
with open(output_files_path + "/output.txt", 'w') as output_file:
    output_file.write("# AP and precision/recall per class\n")
    count_true_positives = {}
    for class_index, class_name in enumerate(gt_classes):
        # ...
        for idx, detection in enumerate(dr_data):
            file_id = detection["file_id"]
            # ...
            # assign detection-results to ground truth object if any
            # open ground-truth with that file_id
            gt_file = TEMP_FILES_PATH + "/" + file_id + "_ground_truth.json"
            ground_truth_data = json.load(open(gt_file))
            ovmax = -1
            gt_match = -1
            # load detected object bounding-box
            bb = [ float(x) for x in detection["bbox"].split() ]
            for obj in ground_truth_data:
                # look for a class_name match
                if obj["class_name"] == class_name:
                    bbgt = [ float(x) for x in obj["bbox"].split() ]
                    bi = [max(bb[0],bbgt[0]), max(bb[1],bbgt[1]), min(bb[2],bbgt[2]), min(bb[3],bbgt[3])]
                    iw = bi[2] - bi[0] + 1
                    ih = bi[3] - bi[1] + 1
                    if iw > 0 and ih > 0:
                        # compute overlap (IoU) = area of intersection / area of union
                        ua = (bb[2] - bb[0] + 1) * (bb[3] - bb[1] + 1) + (bbgt[2] - bbgt[0]
                                        + 1) * (bbgt[3] - bbgt[1] + 1) - iw * ih
                        ov = iw * ih / ua
                        if ov > ovmax:
                            ovmax = ov
                            gt_match = obj
  4. Check whether the highest IoU reaches MINOVERLAP:
########## 579
            if ovmax >= min_overlap:
  5. Check whether the corresponding GT has already been used (each GT corresponds to at most one positive sample). If it has not, mark the predicted bounding box as a TP; otherwise it becomes an FP:
################ 580 - 595
                if "difficult" not in gt_match:
                        if not bool(gt_match["used"]):
                            # true positive
                            tp[idx] = 1
                            gt_match["used"] = True
                            count_true_positives[class_name] += 1
                            # update the ".json" file
                            with open(gt_file, 'w') as f:
                                    f.write(json.dumps(ground_truth_data))
                            if show_animation:
                                status = "MATCH!"
                        else:
                            # false positive (multiple detection)
                            fp[idx] = 1
                            if show_animation:
                                status = "REPEATED MATCH!"

zjZSTU · May 13 '20 02:05

Sorry for the late response; I only saw this now. @zjZSTU, thank you so much for your help!

Basically, you first sort your detections from highest to lowest confidence score. Then you try to assign each of these detections to a ground-truth object. If the class is correct and the IoU > 0.5, you have a true positive (a "MATCH!") and that ground-truth object is marked as used.

Since there can only be one detection per ground-truth object, the following detections (with a lower confidence score) are marked as false positives even if they overlap correctly, because they are a "REPEATED MATCH!".
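
If it helps, here is a minimal, self-contained sketch of that matching rule for a single class, without the file handling (or the "difficult" flag handling) of the real script. The function name match_detections is mine, and it reuses a voc_iou helper like the one sketched above:

def match_detections(detections, ground_truths, min_overlap=0.5):
    # detections: list of {"confidence": float, "bbox": [l, t, r, b]}
    # ground_truths: list of {"bbox": [l, t, r, b]} of the same class
    # Returns per-detection TP/FP flags, in decreasing-confidence order.
    detections = sorted(detections, key=lambda d: d["confidence"], reverse=True)
    used = [False] * len(ground_truths)
    tp = [0] * len(detections)
    fp = [0] * len(detections)
    for i, det in enumerate(detections):
        # find the ground truth with the largest IoU for this detection
        ovmax, best = -1.0, -1
        for j, gt in enumerate(ground_truths):
            ov = voc_iou(det["bbox"], gt["bbox"])
            if ov > ovmax:
                ovmax, best = ov, j
        if ovmax >= min_overlap and not used[best]:
            tp[i] = 1          # "MATCH!"
            used[best] = True  # each ground truth can only be matched once
        else:
            fp[i] = 1          # low IoU, or a "REPEATED MATCH!"
    return tp, fp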

Cartucho · May 13 '20 14:05