Training for 80000 times

Open jiesico opened this issue 2 years ago • 13 comments

Hi! Thank you very much for sharing the DROID-SLAM code. I've trained for 80,000 steps on the TartanAir dataset so far, but the loss still shows no sign of converging. Why is that?

jiesico avatar Mar 23 '22 09:03 jiesico

Hi,

We are seeing a loss curve like this while training on TartanAir.

Screenshot from 2022-03-29 08-51-44

Is this something expected?

Thanks

fabiopoiesi avatar Mar 29 '22 06:03 fabiopoiesi

Were you able to use the weights you obtained from training for testing? When I test with the weights I got from training, the following error is reported:

    Traceback (most recent call last):
      File "/root/docker2/droid/2new/DROID-SLAM/demo.py", line 117, in <module>
        traj_est = droid.terminate(image_stream(args.imagedir, args.calib, args.stride))
      File "/root/docker2/droid/2new/DROID-SLAM/droid_slam/droid.py", line 81, in terminate
        self.backend(7)
      File "/root/anaconda3/envs/droidenv5/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 28, in decorate_context
        return func(*args, **kwargs)
      File "/root/docker2/droid/2new/DROID-SLAM/droid_slam/droid_backend.py", line 33, in __call__
        graph.add_proximity_factors(rad=self.backend_radius,
      File "/root/docker2/droid/2new/DROID-SLAM/droid_slam/factor_graph.py", line 368, in add_proximity_factors
        ii, jj = torch.as_tensor(es, device=self.device).unbind(dim=-1)
    ValueError: not enough values to unpack (expected 2, got 0)

xhangHU avatar May 03 '22 13:05 xhangHU

@xhangHU Hi, I ran the model successfully with the weights they provide, but hit the same problem when I used my self-trained weights. Have you solved this problem yet?

realXiaohan avatar Jun 06 '22 09:06 realXiaohan

> @xhangHU Hi, I ran the model successfully with the weights they provide, but hit the same problem when I used my self-trained weights. Have you solved this problem yet?

It's not solved yet. I also checked, and the network structure of the pretrained model provided by the author is not the same as the one in the released code.

xhangHU avatar Jun 06 '22 09:06 xhangHU

> @xhangHU Hi, can I have your email please? We can talk about it in more detail.

Hi, could I join your discussion? I would like to retrain the model but got stuck at the beginning. Thanks a lot. 12131040[at]mail.sustech.edu.cn

lvmingzhe avatar Jun 07 '22 03:06 lvmingzhe

> @xhangHU Hi, I ran the model successfully with the weights they provide, but hit the same problem when I used my self-trained weights. Have you solved this problem yet?

> It's not solved yet. I also checked, and the network structure of the pretrained model provided by the author is not the same as the one in the released code.

It may be caused by the distance in the graph being less than the thresh. Try a smaller thresh and the demo will run successfully. But I still have some problems with the disps result.
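To make the failure mode concrete, here is a minimal pure-Python sketch, not the DROID-SLAM code itself. It assumes that the edge list `es` in the traceback ends up empty when every candidate pair falls below the threshold. The names `select_edges`, `unguarded_split`, and `split_edges` are hypothetical, as is the "keep a pair if its distance exceeds `thresh`" rule. `zip(*es)` stands in for the `torch.as_tensor(es).unbind(dim=-1)` call.

```python
def select_edges(distances, thresh):
    """Keep (i, j) pairs whose distance exceeds thresh.

    Hypothetical selection rule, used only to illustrate how a large
    threshold can leave the edge list empty.
    """
    return [pair for pair, d in distances.items() if d > thresh]


def unguarded_split(es):
    """Mirror the failing pattern: unpack into exactly two names."""
    ii, jj = zip(*es)  # raises ValueError when es is empty
    return ii, jj


def split_edges(es):
    """Same split, but return two empty tuples when there are no edges."""
    if not es:
        return (), ()
    ii, jj = zip(*es)
    return ii, jj


if __name__ == "__main__":
    # Made-up distances between two frames, for illustration only.
    distances = {(0, 1): 1.5, (1, 0): 1.5}

    es = select_edges(distances, thresh=2.5)  # nothing passes the threshold
    try:
        unguarded_split(es)
    except ValueError as err:
        print("unguarded:", err)

    es = select_edges(distances, thresh=1.0)  # smaller thresh keeps both pairs
    print("guarded:", split_edges(es))
```

Under these assumptions, lowering the threshold lets some pairs through, so the unpacking has something to work with. Whether the resulting poses and disps are any good is a separate question.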

519174419 avatar Jun 10 '22 06:06 519174419

@519174419 Yeah, I solved the problem by setting a smaller thresh, but the ego-motion prediction looks very strange with my self-trained model. Is anything wrong with your disparity map?

realXiaohan avatar Jun 10 '22 13:06 realXiaohan

**realXiaohan** commented 16 minutes ago

Yes. I think the flow_loss and geo_loss are low, but there is still something wrong with the disparity map.

519174419 avatar Jun 10 '22 13:06 519174419

@519174419 We can talk about it in more detail and my email is [[email protected]].

realXiaohan avatar Jun 10 '22 14:06 realXiaohan

> @519174419 Yeah, I solved the problem by setting a smaller thresh, but the ego-motion prediction looks very strange with my self-trained model. Is anything wrong with your disparity map?

Hi @realXiaohan, could you please share your configs for training and the demo?

YznMur avatar Jun 15 '22 08:06 YznMur