alexander
Your paper says that the 3x3 conv destroys the correspondence between 2D image features and 3D positions. However, from the point-cloud view, an image lacks the ability to discover 3D...
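To make the question concrete, here is a minimal NumPy sketch (my own toy example, not code from the paper) of the claimed effect: a 3x3 conv mixes each pixel's feature with its neighbors, so the output at a pixel no longer corresponds to a single camera ray.

```python
import numpy as np

# A feature map where only one pixel carries signal: under a per-pixel
# 2D->3D lifting, that pixel maps to exactly one camera ray / 3D position.
feat = np.zeros((5, 5))
feat[2, 2] = 1.0

# A 3x3 box filter as a stand-in for a 3x3 conv with uniform weights.
kernel = np.ones((3, 3)) / 9.0

out = np.zeros_like(feat)
for i in range(1, 4):
    for j in range(1, 4):
        out[i, j] = np.sum(feat[i - 1:i + 2, j - 1:j + 2] * kernel)

# After the conv, 9 pixels are non-zero: the signal originally tied to
# one ray has leaked into its neighbors, blurring the 2D-3D correspondence.
print(np.count_nonzero(out))  # -> 9
```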
Thanks for your work. Recently I have wanted to do some more work on BEVDepth. Have you tried a ViT backbone, and what is its performance?
LINE 118: cam_data['calibrated_sensor_token']) ==> sweep_lidar_data['calibrated_sensor_token'])
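To illustrate why this fix matters, here is a self-contained toy sketch (the record values are hypothetical, not real nuScenes calibration data): transforming a lidar sweep's points into the ego frame must use the lidar's calibrated_sensor record; using the camera's record shifts every point by the lidar-to-camera offset.

```python
import numpy as np

# Toy stand-ins for nuScenes 'calibrated_sensor' records (values made up).
lidar_calib = {'translation': np.array([0.0, 0.0, 1.8])}   # lidar on the roof
cam_calib = {'translation': np.array([1.5, 0.0, 1.5])}     # camera at the front

point_in_sensor = np.array([10.0, 0.0, 0.0])  # one lidar return, sensor frame

# Correct: a lidar point is moved into the ego frame with lidar extrinsics.
correct = point_in_sensor + lidar_calib['translation']
# Buggy (the pre-fix code path): using cam_data's calibration instead.
buggy = point_in_sensor + cam_calib['translation']

# Every transformed point ends up displaced by the inter-sensor offset.
print(np.linalg.norm(correct - buggy))
```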
I have retrained bev_depth_lss_r50_256x704_128x128_24e_2key.py without depth supervision for 2 epochs, but the detection loss still floats around 25. Is this normal? Does it need more training time?
Hello, dear author, have you verified whether changing the downsample rate from 4 to 16 causes a drop in performance? Like this one: in_channels=[256, 512,...
Hi, dear authors, I tried to replace the backbone from the IPM_encoder with BEVDepth's LSS style; however, the gen_loss does not seem to converge after 2 epochs. Do you think it's ok...
Hello, can you show the loss curve?
Actually, the depth of the foreground should vary smoothly; however, during downsampling, lidar depths vary greatly between adjacent areas.
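A minimal NumPy sketch of the effect (the function and padding choices are mine, not BEVDepth's exact implementation): min-pooling a sparse lidar depth map can put a foreground return and a background return into adjacent downsampled cells, producing a sharp jump between neighbors even though each surface is smooth.

```python
import numpy as np

def downsample_depth_min(depth, factor):
    """Downsample a sparse depth map, keeping the minimum valid
    (non-zero) return inside each factor x factor block."""
    h, w = depth.shape
    b = depth[:h // factor * factor, :w // factor * factor]
    b = b.reshape(h // factor, factor, w // factor, factor)
    b = b.transpose(0, 2, 1, 3).reshape(h // factor, w // factor, -1)
    out = np.zeros(b.shape[:2], dtype=depth.dtype)
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            valid = b[i, j][b[i, j] > 0]
            if valid.size:
                out[i, j] = valid.min()
    return out

# A mostly empty lidar depth map: one foreground return (5 m) and one
# background return (50 m) only two pixels apart.
depth = np.zeros((4, 4))
depth[0, 0] = 5.0    # foreground object
depth[0, 2] = 50.0   # background behind it

out = downsample_depth_min(depth, factor=2)
# Adjacent downsampled cells now read 5 m and 50 m: a 45 m jump between
# neighbors, which is what makes the supervision targets look jagged.
print(out[0])  # -> [ 5. 50.]
```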
Hi, I reviewed voxelnet_3d_bbox: you use SAM to get masks, then use VoxelNet to generate proposals. But VoxelNet is restricted by its training dataset, so could it be generalized...