VIBE
I really can't understand the boxes here.
The box sent to VIBE is clearly in x1y1x2y2 format.
So why is it read as (cx, cy, h) inside this function?
def convert_crop_cam_to_orig_img(cam, bbox, img_width, img_height):
    '''
    Convert predicted camera from cropped image coordinates
    to original image coordinates
    :param cam (ndarray, shape=(3,)): weak perspective camera in cropped img coordinates
    :param bbox (ndarray, shape=(4,)): bbox coordinates (c_x, c_y, h)
    :param img_width (int): original image width
    :param img_height (int): original image height
    :return:
    '''
    cx, cy, h = bbox[:,0], bbox[:,1], bbox[:,2]
    hw, hh = img_width / 2., img_height / 2.
    sx = cam[:,0] * (1. / (img_width / h))
    sy = cam[:,0] * (1. / (img_height / h))
    tx = ((cx - hw) / hw / sx) + cam[:,1]
    ty = ((cy - hh) / hh / sy) + cam[:,2]
    orig_cam = np.stack([sx, sy, tx, ty]).T
    return orig_cam
The bbox coordinates are [cx, cy, w, h]; see line 156 in mpt.py. I cannot understand this line:
cx, cy, h = bbox[:,0], bbox[:,1], bbox[:,2]
why not:
cx, cy, w, h = bbox[:,0], bbox[:,1], bbox[:,2], bbox[:,3]
I see: h = w in bbox, so reading bbox[:,2] alone is enough.
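To make the h = w point concrete, here is a minimal sketch. It assumes the tracker turns each x1y1x2y2 box into a square [cx, cy, w, h] box by using the longer side for both w and h. The helper name xyxy_to_square_cxcywh is made up for this example and is not part of VIBE.

```python
import numpy as np

def xyxy_to_square_cxcywh(boxes):
    # Hypothetical helper (not from the VIBE code base).
    # Assumption: the box is made square by taking the longer side.
    boxes = np.asarray(boxes, dtype=float)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    cx = x1 + (x2 - x1) / 2.
    cy = y1 + (y2 - y1) / 2.
    side = np.maximum(x2 - x1, y2 - y1)  # w = h = longer side
    return np.stack([cx, cy, side, side], axis=1)

# One box in x1y1x2y2 format: 100 px wide, 200 px tall.
boxes_xyxy = np.array([[10., 20., 110., 220.]])
bbox = xyxy_to_square_cxcywh(boxes_xyxy)

# Columns 2 and 3 hold the same value, so reading only bbox[:, 2]
# loses no information as long as the boxes are square.
cx, cy, h = bbox[:, 0], bbox[:, 1], bbox[:, 2]
```

If the boxes passed in were not square, bbox[:,2] would be the width rather than the height, and the scale terms sx and sy in the function would be computed from the wrong side.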