RPMNet
DCP_v2 pretrained model required && Multi-GPU implementations
Hello! Thank you for open-sourcing this impressive work. I have several questions:
- Could you provide the DCP_v2 checkpoints? I trained DCP for 250 epochs (noise-free, 5112/1266 pairs for training/testing) to reproduce the results in Table 1, but my results are far from good.
- I could reproduce the results in Tables 1–3 using the RPMNet checkpoints you provided, but training for 1k epochs on a single GPU is time-consuming. Have you considered a multi-GPU implementation? I tried simply adding

```python
model = nn.DataParallel(model)
```

and it reports a lot of errors.
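For what it's worth, `nn.DataParallel` usually breaks when `forward()` takes or returns structures it cannot scatter/gather across devices, or when training code calls submodules of the wrapped model directly. Below is a minimal sketch of the usual wrapping pattern; `TinyNet` is a hypothetical stand-in, not the actual RPMNet model:

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the registration network."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(3, 3)

    def forward(self, x):
        # DataParallel splits the batch dimension across GPUs, so forward()
        # should take and return tensors (or simple structures of tensors)
        # that it can scatter and gather.
        return self.fc(x)


model = TinyNet()

# Wrap only when more than one GPU is actually visible; on CPU or a single
# GPU the plain model is used unchanged.
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model).cuda()

out = model(torch.randn(8, 3))
print(out.shape)  # torch.Size([8, 3])
```

Note that after wrapping, the original module lives at `model.module`, so any code that accesses custom attributes on the model (as RPMNet's training loop may) needs to go through `.module` — that is a common source of the errors you saw.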
Thanks very much for your help!
Hi,
- Sure, you can find my trained DCP checkpoints here. I didn't check through the files, but these should be the correct ones; let me know if the results look off. Also, it's been a while since I ran this code, but if I recall correctly, training doesn't work well with smaller batch sizes, so you have to run it with the batch size from the original code.
- I haven't tried running the code over multiple GPUs, so I can't help you with this.
Zi Jian
Thanks for your timely reply!
Over the past few days I retrained DCP_v2 for 250 epochs with the following settings:
- unseen -> True: first 20 classes for training, remaining 20 classes for testing, i.e. 5112/1266 training/testing pairs.
- batch_size -> 32.
- noise_type -> clean (no Gaussian noise, no cropping).
The training used the original code released by DCP, and here are the test results:
- Loading the model you provided (unseen-clean.t7):

```
==FINAL TEST==
A--------->B
EPOCH:: -1, Loss: 0.002211, Cycle Loss: 0.000000, MSE: 0.243007, RMSE: 0.492957, MAE: 0.386552, rot_MSE: 12.550731, rot_RMSE: 3.542701, rot_MAE: 2.256805, trans_MSE: 0.000035, trans_RMSE: 0.005906, trans_MAE: 0.004456
B--------->A
EPOCH:: -1, Loss: 0.002211, MSE: 0.243007, RMSE: 0.492957, MAE: 0.367413, rot_MSE: 12.550731, rot_RMSE: 3.542701, rot_MAE: 2.256805, trans_MSE: 0.000575, trans_RMSE: 0.023973, trans_MAE: 0.014308
FINISH
```
- Loading the model I trained for 250 epochs (./checkpoints/dcp_v2/models/model.best.t7):

```
==FINAL TEST==
A--------->B
EPOCH:: -1, Loss: 0.012229, Cycle Loss: 0.000000, MSE: 0.241522, RMSE: 0.491449, MAE: 0.384040, rot_MSE: 58.156357, rot_RMSE: 7.626031, rot_MAE: 5.146891, trans_MSE: 0.000673, trans_RMSE: 0.025949, trans_MAE: 0.019192
B--------->A
EPOCH:: -1, Loss: 0.012229, MSE: 0.241522, RMSE: 0.491449, MAE: 0.365895, rot_MSE: 58.156357, rot_RMSE: 7.626031, rot_MAE: 5.146891, trans_MSE: 0.003950, trans_RMSE: 0.062847, trans_MAE: 0.042664
FINISH
```
It is clear that my training with the default settings cannot reproduce the performance of the pretrained DCP model you provided. I'm wondering how exactly you trained your own DCP model.
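As a quick sanity check on the two logs (the rotation values below are copied from them), the `rot_RMSE` field is just the square root of `rot_MSE`, and the gap between the two checkpoints works out to roughly 2.2x in rotation RMSE:

```python
import math

# Rotation errors copied from the two test logs above.
provided = {"rot_MSE": 12.550731, "rot_RMSE": 3.542701, "rot_MAE": 2.256805}
retrained = {"rot_MSE": 58.156357, "rot_RMSE": 7.626031, "rot_MAE": 5.146891}

for name, m in [("provided", provided), ("retrained", retrained)]:
    # RMSE should equal the square root of MSE.
    assert abs(math.sqrt(m["rot_MSE"]) - m["rot_RMSE"]) < 1e-3
    print(f"{name}: rot_RMSE = {m['rot_RMSE']:.3f} deg")

# Ratio of retrained to provided rotation RMSE.
print(round(retrained["rot_RMSE"] / provided["rot_RMSE"], 2))  # 2.15
```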
:)
By the way, thanks very much for your help!