
labels about ground-truth

Open Etah0409 opened this issue 3 years ago • 11 comments

Hi, thanks for sharing the bounding-box annotations of the ORR dataset. I have some questions about the annotations (such as their units and meaning). Could you please add some notes? Best wishes

Etah0409 avatar May 08 '21 06:05 Etah0409

The format of the 2D labels is: Class, ID, Center Loc X (m), Center Loc Y (m), Width (m), Length (m), Yaw (degree)

The format of the 3D labels is the same as the KITTI labels.

The yaw angle is the clockwise angle from the heading direction of the ego vehicle in the bird's eye view.
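For reference, a minimal parser for one line of this 2D label format could look like the following. The space-delimited layout and the field names are assumptions based on the format described above, not part of the official toolkit:

```python
# Parse one line of a 2D label file. Assumed field order, per the format above:
# Class, ID, Center X (m), Center Y (m), Width (m), Length (m),
# Yaw (degrees, clockwise from the ego vehicle's heading in bird's eye view).
def parse_2d_label_line(line):
    fields = line.strip().split(' ')
    return {
        'class': fields[0],
        'id': int(fields[1]),
        'center_x_m': float(fields[2]),
        'center_y_m': float(fields[3]),
        'width_m': float(fields[4]),
        'length_m': float(fields[5]),
        'yaw_deg': float(fields[6]),
    }

# Hypothetical example line:
label = parse_2d_label_line("car 3 12.5 -4.2 1.8 4.6 90.0")
```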

qiank10 avatar May 09 '21 18:05 qiank10

> The format of the 2D labels is: Class, ID, Center Loc X (m), Center Loc Y (m), Width (m), Length (m), Yaw (degree)
>
> The format of the 3D labels is the same as the KITTI labels.
>
> The yaw angle is the clockwise angle from the heading direction of the ego vehicle in the bird's eye view.

Thanks for your kind reply! It really helps a lot.

Etah0409 avatar May 10 '21 07:05 Etah0409

Hi there, I have been working on this demo recently, but I ran into a puzzling problem when drawing the output. I used matplotlib.pyplot.imshow (raw radar image) and plt.scatter (lidar point cloud) to plot the raw radar image and the lidar point cloud, and they match each other well, as shown below. In the figure, the blue boxes are label boxes and the red dots are the centers of the label boxes; the red boxes are the model's predicted boxes and the blue dots are the centers of the predicted boxes. The yellow points are projected from the point cloud, and the background is the raw radar image. As you can see, the label boxes' center x and center y are located correctly in the figure. But no matter how I shift the box centers by plus or minus half the box width or length, I cannot move them onto any corner of the lidar point clusters in a way that works for the boxes on both sides of the road. To check whether it is correct, here is my result-viewing code:

import matplotlib.pyplot as plt
import matplotlib.patches as patches
import numpy as np
import cv2
from PIL import Image
import os

inputDataPath = '/data/2019-01-10-11-46-21-radar-oxford-10k/processed/'

def readLidarData(filename):
    lidarData = np.fromfile(filename, dtype=np.float32)
    lidarData = np.reshape(lidarData, (-1, 4))
    [pcNum, pcDim] = np.shape(lidarData)
    x_arr = []
    y_arr = []
    z_arr = []
    intensity_arr = []
    for i in range(pcNum):
        # coordinate transfer
        indx_x = 319 - 5 * (lidarData[i, 0] + 32)
        indx_y = 5 * (lidarData[i, 1] + 32)
        indx_z = 5 * (lidarData[i, 2])
        intensity = lidarData[i, 3]
        if indx_x > 320 or indx_x < 0 or indx_y > 320 or indx_y < 0:
            continue
        # pack them together
        x_arr.append(indx_x)
        y_arr.append(indx_y)
        z_arr.append(indx_z)
        intensity_arr.append(intensity)
    return x_arr, y_arr, z_arr, intensity_arr

def readRawIndex():
    dict_temp = {}
    fn_evalIndex = os.path.join(inputDataPath, 'ImageSets', 'eval.txt')
    file = open(fn_evalIndex)
    for line in file.readlines():
        line = line.strip()
        k = line.split(' ')[0]
        v = line.split(' ')[1]
        dict_temp[k] = v
    return dict_temp

def readResult():
    filename = './output/mvdnet/instances_predictions.csv'
    return np.loadtxt(filename)

def readLabel(labelFilename):
    this_labelFilename = os.path.join(inputDataPath, 'label_2d', labelFilename)
    class_arr = []
    id_arr = []
    leftup_x_arr = []
    leftup_y_arr = []
    width_arr = []
    height_arr = []
    angle_arr = []
    with open(this_labelFilename, "r") as f:
        for line in f.readlines():
            tmp = line.split(' ')
            class_arr.append(tmp[0])
            id_arr.append(int(tmp[1]))
            # center - width/2 + center_image
            # leftup_x = 5 * (float(tmp[2]) + float(tmp[4]) / 2) + 319 / 2
            # leftup_y = 5 * (float(tmp[3]) + float(tmp[5]) / 2) + 319 / 2
            leftup_x = 5 * float(tmp[2]) + 319 / 2
            leftup_y = 5 * float(tmp[3]) + 319 / 2
            leftup_x_arr.append(leftup_x)
            leftup_y_arr.append(leftup_y)
            width_arr.append(5 * float(tmp[4]))
            height_arr.append(5 * float(tmp[5]))
            # angle adjust
            angle = float(tmp[6])
            # if angle >= -112.5 and angle < 67.5:
            #     direction = 0
            # else:
            #     direction = 1
            # if angle < -112.5:
            #     angle = angle + 180
            # else:
            #     angle = angle - 180
            angle_arr.append(angle)
    return class_arr, id_arr, leftup_x_arr, leftup_y_arr, width_arr, height_arr, angle_arr

def drawOutput(resultMat, rawIndexDict):
    [objNum, resDim] = np.shape(resultMat)
    frameNum = 2799
    fig = plt.figure()
    for i in range(objNum):
        frameNumNow = int(resultMat[i, 0])
        timestamp = rawIndexDict[str(frameNumNow)]
        fig.canvas.set_window_title(str(timestamp))
        # read raw data for a new frame
        if frameNumNow != frameNum:
            print("frame num: {}".format(frameNum))
            # read lidar data
            lidarFilename = '{}.bin'.format(timestamp)
            lidar_rawDataFilename = os.path.join(inputDataPath, 'lidar', lidarFilename)
            [x_arr, y_arr, z_arr, intensity_arr] = readLidarData(lidar_rawDataFilename)
            # read radar data
            radarFilename = '{}.jpg'.format(timestamp)
            radar_rawDataFilename = os.path.join(inputDataPath, 'radar', radarFilename)
            # draw point cloud and radar image
            plt.scatter(y_arr, x_arr, s=0.05, c='yellow')
            plt.imshow(Image.open(radar_rawDataFilename))
            # load 2D label bboxes
            labelFilename = '{}.txt'.format(timestamp)
            [class_arr, id, label_leftup_x, label_leftup_y,
             label_width, label_height, label_angle] = readLabel(labelFilename)
            for j in range(len(id)):
                plt.gca().add_patch(patches.Rectangle(
                    (label_leftup_x[j], label_leftup_y[j]),
                    label_width[j], label_height[j],
                    # angle = 180 * angle / np.pi,
                    angle=180 - label_angle[j],
                    edgecolor='blue', facecolor='none', lw=2))
                plt.plot(label_leftup_x[j], label_leftup_y[j], 'ro')
        # load result bbox
        height = resultMat[i, 3]
        width = resultMat[i, 4]
        leftup_x = resultMat[i, 1]  # + width / 2
        leftup_y = resultMat[i, 2]  # + height / 2
        angle = resultMat[i, 5]
        # draw them together
        if frameNumNow == frameNum:
            plt.gca().add_patch(patches.Rectangle(
                (leftup_x, leftup_y), width, height,
                # angle = 180 * angle / np.pi,
                angle=90 - angle,
                edgecolor='red', facecolor='none', lw=2))
            plt.plot(leftup_x, leftup_y, 'bo')
        else:
            # plt.hold(False)
            plt.pause(0.1)
            if frameNumNow == 2830:
                plt.pause(0)
            plt.clf()
        frameNum = frameNumNow

def main():
    resultMat = readResult()
    rawDataIndexDict = readRawIndex()
    drawOutput(resultMat, rawDataIndexDict)

if __name__ == "__main__":
    main()

I'd really appreciate any help, thank you.

Sorry, I may have solved this problem by doing the rotation before the translation. My fault, sorry.
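For anyone hitting the same issue, the key is to rotate the box corners about the box center first and only then translate into image coordinates. A minimal sketch of that corner computation (the rotation matrix below uses the standard counter-clockwise convention; with image axes or the clockwise yaw used in these labels you may need to negate the angle):

```python
import numpy as np

def box_corners(cx, cy, w, h, yaw_deg):
    """Corners of a rotated box: rotation first, then translation.

    Rotate the corners about the origin (box-local frame), then
    translate them to the box center (cx, cy).
    """
    yaw = np.deg2rad(yaw_deg)
    # corner offsets relative to the box center, before rotation
    local = np.array([[-w / 2, -h / 2],
                      [ w / 2, -h / 2],
                      [ w / 2,  h / 2],
                      [-w / 2,  h / 2]])
    # counter-clockwise rotation matrix; flip the sign of yaw for
    # a clockwise convention or inverted image axes
    rot = np.array([[np.cos(yaw), -np.sin(yaw)],
                    [np.sin(yaw),  np.cos(yaw)]])
    return local @ rot.T + np.array([cx, cy])
```

Translating first and rotating afterwards spins the box around the wrong pivot, which is exactly the "centers are right but corners never line up" symptom described above.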

yyxr75 avatar Oct 18 '21 03:10 yyxr75

I want to visualize the labels too. Can you provide the code? Thank you for your help!

304886938 avatar Dec 03 '21 10:12 304886938

Sorry to tell you that the code above is all I have. Thanks.

yyxr75 avatar Dec 14 '21 09:12 yyxr75

@yyxr75 Could you tell me how to get instances_predictions.csv? When I run eval.py I get instances_predictions.pth. Is there a code to convert .pth to .csv?

Thank you

MaiRajborirug avatar Mar 24 '23 17:03 MaiRajborirug

Have you solved it? Thanks.

neverstoplearn avatar Mar 28 '23 12:03 neverstoplearn

Have you solved it? Thanks.

@MaiRajborirug @yyxr75

neverstoplearn avatar Mar 28 '23 12:03 neverstoplearn

@neverstoplearn @yyxr75 I can't find instances_predictions.csv either, so I wrote my own readResult function to read the results from instances_predictions.pth. Let me know if you find any bugs or any offset/scale adjustments needed in the function. My visualization for frame '2799' (1547121487422169.bin) looks weird: the ground truth (blue) and the prediction (red) should overlap much more. They seem to share the same top-left points (blue and red dots), but the code might be drawing the bounding boxes' width in the wrong direction.

import torch
import numpy as np

def readResult(filename = '../output/mvdnet/instances_predictions.pth'):
    '''Array
    frameNumNow = resultMat[i, 0] -> 2799
    leftup_x = resultMat[i, 1]#+width/2
    leftup_y = resultMat[i, 2]#+height/2
    height = resultMat[i, 3]
    width = resultMat[i, 4]
    angle = resultMat[i, 5]
    red color for prediction
    '''
    checkpoint = torch.load(filename)
    len_eval = len(checkpoint)  # number of evaluation frame
    resultMat = []
    for i in range(len_eval):
        cars = checkpoint[i]['instances']
        frameNumNow = int(checkpoint[i]['image_id'])
        for j in range(len(cars)):
            leftup_x = cars[j]['bbox'][0]#+width/2
            leftup_y = cars[j]['bbox'][1]#+height/2
            height = cars[j]['bbox'][2]
            width = cars[j]['bbox'][3]
            angle = cars[j]['bbox'][4]
            score = cars[j]['score']
            resultArr = [frameNumNow, leftup_x, leftup_y, height, width, angle, score]
            resultMat.append(resultArr)
    resultMat = np.array(resultMat)
    return resultMat
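If you additionally want a whitespace-delimited instances_predictions.csv that the earlier np.loadtxt-based readResult can consume, a plain dump of resultMat should work. A sketch (the column order matches my resultArr above; the output path is just an example):

```python
import numpy as np

def save_result_csv(resultMat, out_path='instances_predictions.csv'):
    # One row per detection:
    # frame, leftup_x, leftup_y, height, width, angle, score.
    # np.savetxt writes whitespace-delimited floats by default,
    # which is exactly what np.loadtxt reads back.
    np.savetxt(out_path, np.asarray(resultMat), fmt='%.6f')
```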

[screenshot: visualization for frame 2799, ground truth in blue and predictions in red]

MaiRajborirug avatar Mar 30 '23 14:03 MaiRajborirug

Thanks.

neverstoplearn avatar Mar 31 '23 02:03 neverstoplearn


Can you correctly associate the 3D bboxes with the corresponding objects? I tried every coordinate transformation I could think of, then used Open3D to visualize the scene and the bboxes. I tried my best, but I can't get them to line up. 😭😭 I've been stumped by this for a long time. Can you help me?

Afreshbird avatar Sep 20 '23 03:09 Afreshbird