Pytorch_Generalized_3D_Lane_Detection icon indicating copy to clipboard operation
Pytorch_Generalized_3D_Lane_Detection copied to clipboard

关于R_g2c计算方式的疑惑

Open lfydegithub opened this issue 4 years ago • 5 comments

tools/utils.py文件:

def homograpthy_g2im(cam_pitch, cam_height, K):
    # transform top-view region to original image region
    R_g2c = np.array([[1, 0, 0],
                      [0, np.cos(np.pi / 2 + cam_pitch), -np.sin(np.pi / 2 + cam_pitch)],
                      [0, np.sin(np.pi / 2 + cam_pitch), np.cos(np.pi / 2 + cam_pitch)]])
    H_g2im = np.matmul(K, np.concatenate([R_g2c[:, 0:2], [[0], [cam_height], [0]]], 1))
    return H_g2im

R_g2c是车体坐标系绕x轴旋转90+pitch度的矩阵可以理解, 为什么H_g2im却只取前两列呢?

GeoNet3D_ext.py文件 230行:

        # homograph ground to camera
        # H_g2cam = np.array([[1,                             0,               0],
        #                     [0, np.cos(np.pi / 2 + cam_pitch), args.cam_height],
        #                     [0, np.sin(np.pi / 2 + cam_pitch),               0]])
        H_g2cam = np.array([[1,                  0,               0],
                            [0, np.sin(-cam_pitch), args.cam_height],
                            [0, np.cos(-cam_pitch),               0]])

这里H_g2cam其实是 np.concatenate([R_g2c[:, 0:2], [[0], [cam_height], [0]]], 1)的结果, 但是为什么又与上面的定义方式不同了呢?

lfydegithub avatar Feb 24 '21 09:02 lfydegithub

        self.H_ipm2g = cv2.getPerspectiveTransform(np.float32([[0, 0],
                                                              [self.ipm_w-1, 0],
                                                              [0, self.ipm_h-1],
                                                              [self.ipm_w-1, self.ipm_h-1]]),
                                                   np.float32(args.top_view_region))

为什么ipm到top-view的H要叫做H_ipm2g? 然后这个H_ipm2g竟然真的左乘H_g2im得到了H_ipm2im, 然后求其逆, 得到了H_im2ipm. 上面的代码求出来的不应改是H_ipm2gflat吗? top-view和3d ground几件不应该还差了一个几何关系吗?

def transform_lane_g2gflat(h_cam, X_g, Y_g, Z_g):
    """
        Given X coordinates in flat ground space, Y coordinates in flat ground space, and Z coordinates in real 3D ground space
        with projection matrix from 3D ground to flat ground, compute real 3D coordinates X, Y in 3D ground space.

    :param P_g2gflat: a 3 X 4 matrix transforms lane form 3d ground x,y,z to flat ground x, y
    :param X_gflat: X coordinates in flat ground space
    :param Y_gflat: Y coordinates in flat ground space
    :param Z_g: Z coordinates in real 3D ground space
    :return:
    """

    X_gflat = X_g * h_cam / (h_cam - Z_g)
    Y_gflat = Y_g * h_cam / (h_cam - Z_g)

    return X_gflat, Y_gflat

lfydegithub avatar Feb 24 '21 13:02 lfydegithub

tools/utils.py文件:

def homograpthy_g2im(cam_pitch, cam_height, K):
    # transform top-view region to original image region
    R_g2c = np.array([[1, 0, 0],
                      [0, np.cos(np.pi / 2 + cam_pitch), -np.sin(np.pi / 2 + cam_pitch)],
                      [0, np.sin(np.pi / 2 + cam_pitch), np.cos(np.pi / 2 + cam_pitch)]])
    H_g2im = np.matmul(K, np.concatenate([R_g2c[:, 0:2], [[0], [cam_height], [0]]], 1))
    return H_g2im

R_g2c是车体坐标系绕x轴旋转90+pitch度的矩阵可以理解, 为什么H_g2im却只取前两列呢?

GeoNet3D_ext.py文件 230行:

        # homograph ground to camera
        # H_g2cam = np.array([[1,                             0,               0],
        #                     [0, np.cos(np.pi / 2 + cam_pitch), args.cam_height],
        #                     [0, np.sin(np.pi / 2 + cam_pitch),               0]])
        H_g2cam = np.array([[1,                  0,               0],
                            [0, np.sin(-cam_pitch), args.cam_height],
                            [0, np.cos(-cam_pitch),               0]])

这里H_g2cam其实是 np.concatenate([R_g2c[:, 0:2], [[0], [cam_height], [0]]], 1)的结果, 但是为什么又与上面的定义方式不同了呢?

这里可以理解为地平面坐标系到相机坐标系的转换,而地平面坐标系是地面坐标系的xoy平面,上面的点的z坐标都为0,R * [x, y, 0].T + [0, h, 0].T 等价于 [R[:, :2], [0, h, 0].T] * [x, y, 1].T

huxycn avatar Oct 19 '21 08:10 huxycn

R_g2c = np.array([[1, 0, 0],
                  [0, np.cos(np.pi / 2 + cam_pitch), -np.sin(np.pi / 2 + cam_pitch)],
                  [0, np.sin(np.pi / 2 + cam_pitch), np.cos(np.pi / 2 + cam_pitch)]])

为什么[1,2]是sin而不是-sin呢 image

byeciao avatar Mar 13 '22 14:03 byeciao

Why i think that H_g2im should be H_g2im = np.matmul(K, np.concatenate([R_g2c[:, 0:2], [[0], [cam_height * np.cos(cam_pitch)], [cam_height * np.sin(cam_pitch)]]], 1)) Is there something wrong?

homothetic avatar Mar 25 '23 13:03 homothetic

tools/utils.py文件:

def homograpthy_g2im(cam_pitch, cam_height, K):
    # transform top-view region to original image region
    R_g2c = np.array([[1, 0, 0],
                      [0, np.cos(np.pi / 2 + cam_pitch), -np.sin(np.pi / 2 + cam_pitch)],
                      [0, np.sin(np.pi / 2 + cam_pitch), np.cos(np.pi / 2 + cam_pitch)]])
    H_g2im = np.matmul(K, np.concatenate([R_g2c[:, 0:2], [[0], [cam_height], [0]]], 1))
    return H_g2im

R_g2c是车体坐标系绕x轴旋转90+pitch度的矩阵可以理解, 为什么H_g2im却只取前两列呢? GeoNet3D_ext.py文件 230行:

        # homograph ground to camera
        # H_g2cam = np.array([[1,                             0,               0],
        #                     [0, np.cos(np.pi / 2 + cam_pitch), args.cam_height],
        #                     [0, np.sin(np.pi / 2 + cam_pitch),               0]])
        H_g2cam = np.array([[1,                  0,               0],
                            [0, np.sin(-cam_pitch), args.cam_height],
                            [0, np.cos(-cam_pitch),               0]])

这里H_g2cam其实是 np.concatenate([R_g2c[:, 0:2], [[0], [cam_height], [0]]], 1)的结果, 但是为什么又与上面的定义方式不同了呢?

这里可以理解为地平面坐标系到相机坐标系的转换,而地平面坐标系是地面坐标系的xoy平面,上面的点的z坐标都为0,R * [x, y, 0].T + [0, h, 0].T 等价于 [R[:, :2], [0, h, 0].T] * [x, y, 1].T

感觉这里还是诲涩难懂,相机坐标系怎么定义,地面坐标系怎么定义,应该说清楚,和文章里的示意图也不一致

hitbuyi avatar Feb 16 '24 14:02 hitbuyi