
A question about the training time on the KITTI dataset

Open xuqch-77 opened this issue 2 years ago • 0 comments

Hello Cavalli,

Thanks for this good work.

I have a question about the training time on the KITTI dataset.

When I run the training pipeline with the default configuration, each iteration takes more than 5 seconds, even after the matching results have been cached. I am using CUDA 11.6 on an NVIDIA A100 GPU.

The default value of maxlen in the main method of train.py is 2. If I change maxlen to None, all of the image pairs in sequences 0, 1, 2, 3, and 4 would be used in one epoch, right? In that case the total number of pairs is 98784, and the training time is more than 137 hours per epoch. Training for more than one epoch would take a very long time.
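To show where the 137 hours comes from, here is a rough back-of-the-envelope sketch. It assumes one image pair is processed per iteration, which is my assumption and not something I have confirmed in train.py; the helper name epoch_hours and its pairs_per_iteration parameter are hypothetical and only used for this estimate.

```python
def epoch_hours(num_pairs, seconds_per_iteration, pairs_per_iteration=1):
    """Estimate wall-clock hours for one epoch.

    pairs_per_iteration=1 is an assumption, not a value taken from the repo.
    """
    iterations = num_pairs / pairs_per_iteration
    return iterations * seconds_per_iteration / 3600.0


# 98784 pairs at roughly 5 seconds per iteration
hours = epoch_hours(98784, 5.0)
print(f"{hours:.1f} hours per epoch")
```

If several pairs are actually processed per iteration, the estimate shrinks by that factor, so the real number depends on the batch setting.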

I am wondering what values of maxlen and the number of training epochs you used in your experiments.

By the way, I got two errors when running the training pipeline:

  1. The first error is CUDA error: device-side assert triggered. While debugging, I found that some of the poses in compute_rt_error contain NaNs. The error goes away after I add the line poses = [pose for pose in poses if np.isnan(pose[0]).sum() == 0 and np.isnan(pose[1]).sum() == 0] before line 154: https://github.com/cavalli1234/NeFSAC/blob/f96595b3281e42fe41091e91719ac402e1cd5f1f/source/data.py#L144-L159
  2. The second error is name 'cache_path' is not defined. I guess it should be data_cache_path in line 42 of data.py: https://github.com/cavalli1234/NeFSAC/blob/f96595b3281e42fe41091e91719ac402e1cd5f1f/source/data.py#L39-L48
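For the first error, the NaN filter can be sketched as a standalone function. This is only my workaround, not a fix from the repository. It assumes that poses is a list of pairs of NumPy arrays (for example a rotation and a translation), which is how I read compute_rt_error; the function name drop_nan_poses and the sample data are hypothetical.

```python
import numpy as np


def drop_nan_poses(poses):
    """Keep only pose pairs whose two arrays contain no NaN entries.

    Assumes each pose is a pair of NumPy arrays (an assumption about the data layout).
    """
    return [
        pose for pose in poses
        if not np.isnan(pose[0]).any() and not np.isnan(pose[1]).any()
    ]


# Made-up sample data: one valid pose, one with a NaN first array, one with a NaN second array
poses = [
    (np.eye(3), np.array([0.0, 0.0, 1.0])),
    (np.full((3, 3), np.nan), np.array([0.0, 0.0, 1.0])),
    (np.eye(3), np.array([np.nan, 0.0, 1.0])),
]
clean = drop_nan_poses(poses)
print(len(clean))
```

This only hides the NaNs before they reach the GPU code; it does not explain why those poses are NaN in the first place.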

xuqch-77, Jan 29 '23 08:01