Flow-Guided-Feature-Aggregation
How to generate gt_motion_iou file?
I noticed that lib/dataset/imagenet_vid_groundtruth_motion_iou.mat contains IoUs between frames on the val set. I want to generate a similar file for the train set. Is there any code to generate this file, or was it annotated by hand?
Hello, have you solved the problem? I want to fine-tune on my own dataset.
@liangxi627 Hello, I haven't solved this problem. I guess it was annotated by people rather than generated by code.
Hi guys,
The FGFA paper suggests averaging the IoU over ±10 frames for each object instance. I have implemented a script that does this for you (it's not very fast, but it produces the correct results):
import numpy as np
import scipy.io as sio


def boxoverlap(bb, bbgt):
    # IoU of two boxes given as [xmin, ymin, xmax, ymax, ...] (inclusive pixel coordinates)
    ov = 0
    iw = np.min((bb[2], bbgt[2])) - np.max((bb[0], bbgt[0])) + 1
    ih = np.min((bb[3], bbgt[3])) - np.max((bb[1], bbgt[1])) + 1
    if iw > 0 and ih > 0:
        # compute overlap as area of intersection / area of union
        intersect = iw * ih
        ua = (bb[2] - bb[0] + 1.) * (bb[3] - bb[1] + 1.) + \
             (bbgt[2] - bbgt[0] + 1.) * \
             (bbgt[3] - bbgt[1] + 1.) - intersect
        ov = intersect / ua
    return ov


# dataset is a list of videos, which are lists of frames that are lists of boxes ([xmin, ymin, xmax, ymax, cls, track]):
# eg for the val dataset:
# dataset = [ #vid ILSVRC2015_val_00000000>[#frame 000000>[[606, 417, 172, 7, 26, 0]], ... ], ...]
# (you have to build `dataset` yourself before running the loop below)

# load the provided val file so the computed values can be checked against it
mat = sio.loadmat('imagenet_vid_groundtruth_motion_iou.mat')['motion_iou']
# keep this as a list of lists: the videos have different lengths, so np.array() on it
# would be ragged (recent NumPy versions refuse that without dtype=object)
motion_iou = [[mat[i][0][j][0] if len(mat[i][0][j]) != 0 else 0
               for j in range(len(mat[i][0]))]
              for i in range(len(mat))]
all_motion_iou = np.concatenate(motion_iou, axis=0)

all_ious = []
c = 0  # running index into all_motion_iou
for video in dataset:  # iterate over the videos directly (enumerate() would yield (index, video) tuples)
    for frame in range(len(video)):
        frame_ious = []
        for box_idx in range(len(video[frame])):
            trk_id = video[frame][box_idx][5]
            if trk_id > -1:
                ious = []
                # compare with the same track in the 10 frames before and after
                for i in range(-10, 11):
                    frame_c = frame + i
                    if 0 <= frame_c < len(video) and i != 0:
                        for c_box_idx in range(len(video[frame_c])):
                            c_trk_id = video[frame_c][c_box_idx][5]
                            if trk_id == c_trk_id:
                                a = video[frame][box_idx]
                                b = video[frame_c][c_box_idx]
                                ious.append(boxoverlap(a, b))
                                break
                # note: np.mean([]) is nan if the track appears in no other frame of the window
                mean_iou = np.mean(ious)
                # sanity check against the val file; drop this when running on another split
                if np.abs(mean_iou - all_motion_iou[c]) > .01:
                    print("They don't match {} {} {}".format(len(all_ious), mean_iou, all_motion_iou[c]))
                frame_ious.append(mean_iou)
                c += 1
        if frame_ious:
            all_ious.append(frame_ious)
        else:  # if this frame has no boxes we add a 0.0
            all_ious.append([0.0])
            c += 1

# both lengths should be equal if the computed values line up with the file
print(len(all_motion_iou))
print(len(np.concatenate(all_ious, axis=0)))
Hope this helps :)
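For anyone who wants to see the ±10 frame averaging in isolation, here is a minimal sketch on made-up data. It does not read the ImageNet VID annotations or the .mat file; the `iou` and `motion_iou` helpers and the two toy tracks are hypothetical names used only for illustration, and it assumes boxes are [xmin, ymin, xmax, ymax] with inclusive pixel coordinates, as in the script above.

```python
def iou(a, b):
    """IoU of two [xmin, ymin, xmax, ymax] boxes with inclusive pixel coordinates."""
    iw = min(a[2], b[2]) - max(a[0], b[0]) + 1
    ih = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / float(area_a + area_b - inter)


def motion_iou(track, t, window=10):
    """Mean IoU between the box at frame t and the same track in frames t-window..t+window.

    `track` maps frame index -> box; frames where the object is absent are missing keys.
    Returns None when the track has no other frame inside the window.
    """
    ious = [iou(track[t], track[t + d])
            for d in range(-window, window + 1)
            if d != 0 and (t + d) in track]
    return sum(ious) / len(ious) if ious else None


# Toy tracks (made up): a 100x100 box that stays put, and one that shifts 5 px per frame.
static = {f: [0, 0, 99, 99] for f in range(30)}
moving = {f: [5 * f, 0, 5 * f + 99, 99] for f in range(30)}

print(motion_iou(static, 15))  # a box that never moves overlaps itself fully
print(motion_iou(moving, 15))  # a moving box gets a lower score
```

A stationary object scores 1.0 and the score drops as the object moves faster, which is why the value can serve as a per-instance motion measure.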
Have you solved the problem? I also think the IoU is not calculated anywhere in the code.
Hi, I don't understand what the content of imagenet_vid_groundtruth_motion_iou.mat means. Could you help me? I find it's just one column vector, even after the motion_iou = np.array(xxx) processing. I expected a matrix of shape (#img, #total_classification), where M[i,j] is the motion IoU of the i-th image for class j. Thanks a lot~
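A possible reading, based only on how the script earlier in this thread indexes the file: the values are stored per video, then per frame, with one motion IoU per ground-truth box (not per class), so flattening them naturally gives one long vector. As far as I recall, the FGFA paper uses these scores to split objects into slow (> 0.9), medium (0.7 to 0.9) and fast (< 0.7) groups for evaluation; treat those thresholds as an assumption and check them against the paper. The sketch below uses made-up numbers, and `speed_group` is a hypothetical helper, not a function from the repository.

```python
def speed_group(motion_iou, slow_thresh=0.9, fast_thresh=0.7):
    """Bin one motion IoU score by object speed (thresholds are assumed, see the note above)."""
    if motion_iou > slow_thresh:
        return "slow"
    if motion_iou < fast_thresh:
        return "fast"
    return "medium"


# Made-up nested layout: videos -> frames -> one score per ground-truth box.
per_video = [
    [[0.95], [0.93, 0.41]],   # video 0: frame 0 has one box, frame 1 has two
    [[0.82], [0.70]],         # video 1: two frames with one box each
]

# Flattening drops the video/frame nesting and leaves one score per box.
flat = [score for video in per_video for frame in video for score in frame]
groups = [speed_group(score) for score in flat]

print(flat)
print(groups)
```

Under this reading there is no class dimension at all: the class of each box comes from the annotation itself, and the file only adds a motion score for it.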