Flow-Guided-Feature-Aggregation

How to generate gt_motion_iou file?

Open moothes opened this issue 5 years ago • 5 comments

I notice that lib/dataset/imagenet_vid_groundtruth_motion_iou.mat contains IoUs between frames on the val set. For some reasons I want to generate a similar file for the train set. Is there any code to generate this file, or is it annotated by hand?

moothes avatar Jul 10 '18 03:07 moothes

Hello, have you solved the problem? I want to fine-tune on my dataset.

liangxi627 avatar Sep 18 '18 02:09 liangxi627

@liangxi627 Hello, I haven't solved this problem. I guess it's annotated by people rather than generated by code.

moothes avatar Sep 18 '18 12:09 moothes

Hi guys,

The FGFA paper suggests averaging the IoU over ±10 frames for each object instance. I have implemented a script that does this for you (it's not very fast, but it produces the correct results):

import numpy as np
import scipy.io as sio

def boxoverlap(bb, bbgt):
    ov = 0
    iw = np.min((bb[2],bbgt[2])) - np.max((bb[0],bbgt[0])) + 1
    ih = np.min((bb[3],bbgt[3])) - np.max((bb[1],bbgt[1])) + 1
    if iw>0 and ih>0:
        # compute overlap as area of intersection / area of union
        intersect = iw * ih
        ua = (bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) + \
               (bbgt[2] - bbgt[0] + 1.) * \
               (bbgt[3] - bbgt[1] + 1.) - intersect
        ov = intersect / ua
    return ov

# dataset is a list of videos, each video is a list of frames, and each frame is a list of boxes
# ([xmin, ymin, xmax, ymax, cls, track]); e.g. for the val dataset:
# dataset = [ #vid ILSVRC2015_val_00000000>[#frame 000000>[[606, 417, 172, 7, 26, 0]], ... ], ...]
# (a sketch of one way to build this structure from the VID annotations follows the script)

# the provided .mat holds, for each frame of the val set, one motion IoU value per ground-truth box
motion_iou = sio.loadmat('imagenet_vid_groundtruth_motion_iou.mat')
motion_iou = np.array(
    [[motion_iou['motion_iou'][i][0][j][0] if len(motion_iou['motion_iou'][i][0][j]) != 0 else 0
      for j in range(len(motion_iou['motion_iou'][i][0]))]
     for i in range(len(motion_iou['motion_iou']))])

all_motion_iou = np.concatenate(motion_iou, axis=0)

all_ious = []
c = 0
for video in dataset:
    for frame in range(len(video)):
        frame_ious = []
        for box_idx in range(len(video[frame])):
            trk_id = video[frame][box_idx][5]
            if trk_id > -1:
                ious = []
                for i in range(-10, 11):
                    frame_c = frame + i
                    if 0 <= frame_c < len(video) and i != 0:
                        for c_box_idx in range(len(video[frame_c])):
                            c_trk_id = video[frame_c][c_box_idx][5]
                            if trk_id == c_trk_id:
                                a = video[frame][box_idx]
                                b = video[frame_c][c_box_idx]
                                ious.append(boxoverlap(a, b))
                                break

                # sanity check against the provided val-set .mat (only meaningful when dataset is the val set)
                if np.abs(np.mean(ious) - all_motion_iou[c]) > .01:
                    print("They don't match {} {} {}".format(len(all_ious), np.mean(ious), all_motion_iou[c]))
                frame_ious.append(np.mean(ious))
                c += 1
        if frame_ious:
            all_ious.append(frame_ious)
        else:  # if this frame has no boxes we add a 0.0
            all_ious.append([0.0])
            c += 1

all_ious = np.array(all_ious, dtype=object)  # per-frame lists are ragged, so force an object array
print(len(all_motion_iou))
print(len(np.concatenate(all_ious, axis=0)))
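
In case it's useful, here is a minimal sketch of one way to build the `dataset` structure above from the ImageNet VID annotation XMLs. This is not part of the FGFA code; the paths and the WNID_TO_CLASS mapping are placeholders you would need to adapt (and the train annotations are nested one folder level deeper than val):

import os
import xml.etree.ElementTree as ET

# hypothetical mapping from the WordNet IDs used in the XMLs to class indices; fill in all 30 VID classes
WNID_TO_CLASS = {'n02084071': 8}

def load_video(anno_dir):
    # parse one video's per-frame XMLs into lists of [xmin, ymin, xmax, ymax, cls, track]
    frames = []
    for xml_name in sorted(os.listdir(anno_dir)):
        boxes = []
        root = ET.parse(os.path.join(anno_dir, xml_name)).getroot()
        for obj in root.findall('object'):
            bb = obj.find('bndbox')
            boxes.append([int(bb.find('xmin').text), int(bb.find('ymin').text),
                          int(bb.find('xmax').text), int(bb.find('ymax').text),
                          WNID_TO_CLASS.get(obj.find('name').text, -1),
                          int(obj.find('trackid').text)])
        frames.append(boxes)
    return frames

def load_dataset(vid_anno_root):
    # e.g. vid_anno_root = 'ILSVRC2015/Annotations/VID/val'
    return [load_video(os.path.join(vid_anno_root, v)) for v in sorted(os.listdir(vid_anno_root))]

# dataset = load_dataset('ILSVRC2015/Annotations/VID/val')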

Hope this helps :)

HaydenFaulkner avatar Jul 08 '19 12:07 HaydenFaulkner

I notice that lib/dataset/imagenet_vid_groundtruth_motion_iou.mat contains IoUs between frames on the val set. For some reasons I want to generate a similar file for the train set. Is there any code to generate this file, or is it annotated by hand?

Have you solved the problem? I also think the IoU is not calculated in the code.

blueeda avatar Nov 10 '19 13:11 blueeda

Hi, I don't understand what the content of imagenet_vid_groundtruth_motion_iou.mat means. Could you help me? I find it's just one column vector, even after the motion_iou = np.array(xxx) processing. I had expected a matrix of shape (#img, #total_classes), where M[i, j] is the motion IoU of the i-th image for class j, but that is not what I see. Thanks a lot~
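
For reference, here is the small snippet I used to look at it (the comments are my guess at the layout, based on the indexing in the script above):

import scipy.io as sio

mat = sio.loadmat('imagenet_vid_groundtruth_motion_iou.mat')
cells = mat['motion_iou']
print(cells.shape)   # one cell per frame of the val set, not per (image, class)
print(cells[0][0])   # each frame's cell holds one motion IoU per ground-truth box (may be empty)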

peijl1998 avatar Apr 12 '21 08:04 peijl1998