Yi Wei
Yeah, our generated labels are panoptic occupancy labels, not monocular ones.
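The original GT is at the highest resolution, so lower-resolution labels have to be obtained by pooling-style downsampling rather than interpolation. Simple average pooling or max pooling is clearly unsuitable for downsampling occupancy GT, though. In theory, the right approach is to take the mode within each cell of, say, 2x2x2 voxels to get the downsampled GT, but the mode operation is hard to batch and slow, so we use an approximation.
1. We do not compute the mode here. The original GT is an Nx4 tensor, where N is the number of points and the four values are the xyz coordinates at the highest resolution and the corresponding semantic label. We need to embed the information of the N points into an HxWxZ tensor by indexing. For the highest-resolution tensor we can index directly, but for the lower-resolution ones we first downsample the xyz coordinates and then index. 2. Semantic labels cannot simply be averaged. For example, if 4 voxels in an original 2x2x2 block have label 4 and the other 4 have label 0, the average is 2, which is a completely wrong label.
Yes, that's right, your understanding is correct~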
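As a rough illustration of the two points above, here is a minimal NumPy sketch. It is not the repository's actual code: the function name `scatter_labels`, the `factor` and `empty_label` parameters, and the assumption that the xyz columns are integer voxel indices are all hypothetical. Downsampling is assumed to be integer division of the coordinates. Note that when several points land in the same low-resolution voxel, NumPy does not guarantee which label is kept, which is what makes this an approximation rather than a true mode.

```python
import numpy as np

def scatter_labels(points, grid_shape, factor=1, empty_label=0):
    """Embed an (N, 4) array of [x, y, z, label] rows into a dense label grid.

    Assumptions (not from the original code): x, y, z are integer voxel
    indices at the highest resolution, grid_shape is the full-resolution
    (H, W, Z), and factor is the integer downsampling ratio.
    """
    points = np.asarray(points, dtype=np.int64)
    H, W, Z = (s // factor for s in grid_shape)
    grid = np.full((H, W, Z), empty_label, dtype=np.int64)
    # Downsample the coordinates first, then index into the smaller grid.
    coords = points[:, :3] // factor
    # If several points map to the same voxel, one of their labels is kept;
    # which one is not guaranteed by NumPy.
    grid[coords[:, 0], coords[:, 1], coords[:, 2]] = points[:, 3]
    return grid

# Example points: two with label 4 that share a 2x2x2 block, one with label 7.
points = [[0, 0, 0, 4], [1, 1, 1, 4], [2, 3, 0, 7]]
full = scatter_labels(points, (4, 4, 2), factor=1)  # direct indexing
half = scatter_labels(points, (4, 4, 2), factor=2)  # downsample, then index

# Why averaging fails: four voxels with label 4 and four with label 0
# average to 2, a class that none of the voxels belongs to.
block_labels = np.array([4, 4, 4, 4, 0, 0, 0, 0])
averaged = block_labels.mean()
```

Indexing with downsampled coordinates always writes a label that actually occurs in the block, whereas the average can produce a class that is not present at all.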
Hi, we use 8 RTX 3090s and the entire validation evaluation takes about 8 minutes. I think something may be wrong.
Hi, this was a file path bug; thank you for pointing it out. We have fixed it, and you can try again with the latest code.
Hi, our Open3D version is 0.9.0.
Although our model is better than other SOTA methods with the same ground truth, we argue that dense ground truth is necessary for the dense occupancy prediction task. For other...
It will take about 2.5 days, which is similar to BEVFormer's training time.
Thank you! We have fixed this typo!