RenderOcc
How to use `camera_mask` during training?
Hi, when I train RenderOcc with `camera_mask`, my results only reach a maximum mIoU of 30.53, rather than the 40 to 50 reported by UniOcc. Could you please share how you used `camera_mask` for training?
Currently my approach is:
def loss_3d(self, voxel_semantics, camera_mask, density_prob, semantic):
    voxel_semantics = voxel_semantics.long()
    voxel_semantics = voxel_semantics.reshape(-1)
    density_prob = density_prob.reshape(-1, 2)
    semantic = semantic.reshape(-1, self.num_classes - 1)
    density_target = (voxel_semantics == 17).long()
    semantic_mask = voxel_semantics != 17
    camera_mask = camera_mask.reshape(-1)
    density_prob = density_prob[camera_mask]
    density_target = density_target[camera_mask]
    valid_mask = torch.logical_and(semantic_mask, camera_mask)
    voxel_semantics = voxel_semantics[valid_mask]
    semantic = semantic[valid_mask]
    # compute loss
    loss_geo = self.loss_occ(density_prob, density_target)
    loss_sem = self.semantic_loss(semantic, voxel_semantics.long())
    loss_ = dict()
    loss_['loss_3d_geo'] = loss_geo
    loss_['loss_3d_sem'] = loss_sem
    return loss_
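
To make the masking logic easier to check in isolation, here is a minimal standalone sketch of the same idea on toy tensors. It is not the RenderOcc or UniOcc implementation. It assumes 18 classes with index 17 as the free class (as in the snippet above), replaces `self.loss_occ` and `self.semantic_loss` with plain `F.cross_entropy` as stand-ins, and casts `camera_mask` to `bool` explicitly, since the mask may be stored as `uint8`. The names `masked_loss_3d`, `FREE_CLASS`, and `NUM_CLASSES` are hypothetical.

```python
import torch
import torch.nn.functional as F

FREE_CLASS = 17   # assumption: index 17 marks free (empty) voxels
NUM_CLASSES = 18  # assumption: 17 semantic classes plus the free class


def masked_loss_3d(voxel_semantics, camera_mask, density_logits, semantic_logits):
    """Geometry and semantic losses restricted to camera-visible voxels."""
    voxel_semantics = voxel_semantics.reshape(-1).long()
    # Cast explicitly so indexing is always treated as a boolean mask.
    camera_mask = camera_mask.reshape(-1).bool()
    density_logits = density_logits.reshape(-1, 2)
    semantic_logits = semantic_logits.reshape(-1, NUM_CLASSES - 1)

    # Geometry target: 1 where the voxel is free, 0 where it is occupied.
    density_target = (voxel_semantics == FREE_CLASS).long()
    # Semantic loss only on voxels that are both visible and occupied.
    occupied_visible = camera_mask & (voxel_semantics != FREE_CLASS)

    # Stand-in losses: plain cross-entropy instead of the model's own loss modules.
    loss_geo = F.cross_entropy(density_logits[camera_mask],
                               density_target[camera_mask])
    loss_sem = F.cross_entropy(semantic_logits[occupied_visible],
                               voxel_semantics[occupied_visible])
    return {'loss_3d_geo': loss_geo, 'loss_3d_sem': loss_sem}


# Toy inputs: batch of 1, a 4 x 4 x 2 voxel grid.
torch.manual_seed(0)
B, X, Y, Z = 1, 4, 4, 2
voxel_semantics = torch.randint(0, NUM_CLASSES, (B, X, Y, Z))
camera_mask = torch.randint(0, 2, (B, X, Y, Z), dtype=torch.uint8)
# Guarantee at least one visible, occupied voxel so neither loss is empty.
voxel_semantics[0, 0, 0, 0] = 3
camera_mask[0, 0, 0, 0] = 1
density_logits = torch.randn(B, X, Y, Z, 2)
semantic_logits = torch.randn(B, X, Y, Z, NUM_CLASSES - 1)

losses = masked_loss_3d(voxel_semantics, camera_mask,
                        density_logits, semantic_logits)
print(losses)
```

With this setup, predictions for voxels outside the camera mask have no influence on either loss term. Note that if a batch contains no visible occupied voxels, the mean-reduced cross-entropy is taken over an empty selection, so a real training loop would likely need a guard for that case.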