manydepth

get_depth never enabled

Open didriksg opened this issue 3 years ago • 3 comments

Hi, and thank you for your contribution!

I previously trained monodepth2 on the Lyft dataset with success, and I'm now trying to train manydepth with the same dataloader (with some modifications, e.g. for the new load_intrinsics() function). When using ground-truth depth generated from the scans of an onboard lidar, I noticed that get_depth() is never called, even though check_depth() returns True. Looking at MonoDataset, I noticed on line 192 that this functionality appears to be disabled. Is this intentional?

From MonoDataset:

        if self.load_depth and False:
            depth_gt = self.get_depth(folder, frame_index, side, do_flip)
            inputs["depth_gt"] = np.expand_dims(depth_gt, 0)
            inputs["depth_gt"] = torch.from_numpy(inputs["depth_gt"].astype(np.float32))

I tried removing the additional `and False`, but it seems that the lidar files in the Lyft dataset do not contain a number of values divisible by 4, as shown by this ValueError:

  File "/cluster/work/didriksg/depth_detection/manydepth/manydepth/kitti_utils.py", line 70, in generate_depth_map
    velo = load_velodyne_points(velo_filename)
  File "/cluster/work/didriksg/depth_detection/manydepth/manydepth/kitti_utils.py", line 16, in load_velodyne_points
    points = np.fromfile(filename, dtype=np.float32).reshape(-1, 4)
ValueError: cannot reshape array of size 555895 into shape (4)

I suppose a solution here is to drop the last (or first) three values so that the array size is divisible by 4?
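A quick arithmetic check on the size reported in the error, as a sketch only: the remainder modulo 4 is 3, so trimming three values would make the reshape succeed, but the result would only be meaningful if the file really stores 4 values per point. Listing the small record widths that divide the size evenly hints at other possible layouts; the size alone cannot prove which layout is used.

```python
# Array size taken from the ValueError above.
n_values = 555895

# Remainder when assuming 4 float32 values per point (the KITTI layout).
remainder_4 = n_values % 4
print(remainder_4)

# Small per-point record widths that would divide the file evenly.
candidate_widths = [k for k in range(3, 9) if n_values % k == 0]
print(candidate_widths)
```

If the only candidate width is not 4, trimming values would shift every subsequent point out of alignment rather than fix the data.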

I also have some questions regarding some suspicious-looking loss, but I will look a bit more into it and possibly post it in a separate issue.

didriksg avatar May 15 '21 18:05 didriksg

Hi - thanks for your interest, and sorry for the delayed response.

Good catch on the disabled depth loading, that is unintentional! Thankfully it doesn't affect the training of Manydepth, but I will push a fix shortly.

I am not familiar with the Lyft dataset, so I am probably not the best source of information. However, I believe that KITTI velodyne data has 4 values per point (x, y, z, reflectance), and that is why it is reshaped into (num_points x 4) on line 16 of kitti_utils.py, in load_velodyne_points. Do you know how the Lyft lidar data is stored? If it is stored in some other format, then you will need to amend this function accordingly.

JamieWatson683 avatar May 24 '21 14:05 JamieWatson683

Hey, I checked out the Lyft lidar format. Apparently, they store their lidar data as (x, y, z, intensity, ring_index). Here's their code for reading a .bin file containing the points: https://github.com/lyft/nuscenes-devkit/blob/8b55159e89d6318f143bd44dbdfde99ad7ff72e8/lyft_dataset_sdk/utils/data_classes.py#L259-L284

The output from this reading is a (4, n_points) array in (x, y, z, intensity) format.
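A minimal sketch of a reader for this layout, assuming each point is stored as 5 consecutive float32 values (x, y, z, intensity, ring_index) as described above. The helper name `load_lidar_points` and its `values_per_point` parameter are hypothetical and not part of manydepth or the Lyft SDK. It returns an (n_points, 4) array, i.e. the row-per-point layout that the reshape in load_velodyne_points produces, rather than the (4, n_points) layout of the Lyft reader.

```python
import numpy as np


def load_lidar_points(filename, values_per_point=5):
    """Hypothetical helper: read a flat float32 .bin file of lidar points.

    Assumes `values_per_point` float32 values per point, with
    (x, y, z, intensity) stored first. Returns an (n_points, 4) array.
    """
    scan = np.fromfile(filename, dtype=np.float32)
    if scan.size % values_per_point != 0:
        raise ValueError(
            f"{scan.size} values is not a multiple of {values_per_point}; "
            "the file probably uses a different layout"
        )
    points = scan.reshape(-1, values_per_point)
    # Keep x, y, z, intensity and drop any trailing fields such as ring_index.
    return points[:, :4]
```

Note that this only addresses the file layout. Projecting the points into the camera still depends on the dataset's own calibration, so it would not be a drop-in replacement for the KITTI depth-map generation.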

I see that they also have code for generating a depth map, which is what I need: https://github.com/lyft/nuscenes-devkit/blob/8b55159e89d6318f143bd44dbdfde99ad7ff72e8/lyft_dataset_sdk/lyftdataset.py#L736-L798

What I can probably do is generate these maps offline and load them in the dataloader. I will try it and come back to you with an update on the result.
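The offline approach could look roughly like the following sketch, assuming each depth map is stored as a float32 .npy file and that a dataset's get_depth() would delegate to the loader. The helper names `save_depth_map` and `load_depth_map` are hypothetical; only the horizontal flip mirrors the do_flip argument seen in the MonoDataset snippet above.

```python
import numpy as np


def save_depth_map(path, depth):
    """Hypothetical helper: store one (H, W) depth map as float32."""
    np.save(path, np.asarray(depth, dtype=np.float32))


def load_depth_map(path, do_flip=False):
    """Hypothetical helper: load a precomputed (H, W) depth map.

    When `do_flip` is set, the map is mirrored horizontally so that it
    stays aligned with a horizontally flipped input image.
    """
    depth_gt = np.load(path)
    if do_flip:
        # Copy so the result is contiguous rather than a negatively strided view.
        depth_gt = np.fliplr(depth_gt).copy()
    return depth_gt
```

Storing one file per frame keeps the dataloader free of any lidar-specific projection code, at the cost of extra disk space.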

didriksg avatar May 27 '21 16:05 didriksg

Hi again! I was able to generate depth maps for the Lyft data, which can be loaded in the dataloader directly. However, I noticed that the compute_depth_losses function in trainer.py only seems to support depth maps with the same dimensions as KITTI data, due to the cropping done here: https://github.com/nianticlabs/manydepth/blob/28fbbcfa4370eb9c28860b8ab72b09547c0d14d0/manydepth/trainer.py#L662-L665

For my own training purposes, I have disabled the cropping in the compute_depth_losses function and moved the cropping out to the dataloaders, and I think it would be better overall if this cropping happens in the dataloader to support custom datasets.
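One way the dataloader-side cropping could be done is sketched below, assuming the depth metrics only consider pixels where the ground-truth depth is greater than zero, so that zeroing pixels outside the crop has the same effect as masking them. The helper name `crop_depth_gt` and its fractional arguments are hypothetical; each dataset would pass its own crop, expressed relative to its image size instead of as fixed pixel indices.

```python
import numpy as np


def crop_depth_gt(depth_gt, top, bottom, left, right):
    """Hypothetical helper: zero out ground-truth depth outside a crop.

    The crop bounds are fractions of the image height and width in [0, 1],
    so the same call works for any resolution. `depth_gt` may be (H, W)
    or have leading dimensions, e.g. (1, H, W).
    """
    height, width = depth_gt.shape[-2:]
    r0, r1 = int(top * height), int(bottom * height)
    c0, c1 = int(left * width), int(right * width)
    cropped = np.zeros_like(depth_gt)
    cropped[..., r0:r1, c0:c1] = depth_gt[..., r0:r1, c0:c1]
    return cropped
```

With the crop applied per dataset, the loss code would no longer need to hard-code a crop for one particular image size.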

didriksg avatar May 30 '21 16:05 didriksg