About the performance on the nuscenes validation set

Open zheng0819 opened this issue 1 year ago • 9 comments

Hello, I'm very interested in the SAFDNet work. When I reproduce it (8 RTX 3090 GPUs), it achieves 66.15% mAP and 70.41% NDS on the nuScenes validation set, which is lower than the claimed 66.3% mAP and 71.0% NDS (especially NDS). I don't know the reason for this. Is it because the Transfusion_L head in the OpenPCDet framework does not perform as well as in mmdetection3d? Could you offer some advice on reproducing the work? Much appreciated!

zheng0819 avatar Jul 04 '24 01:07 zheng0819

His 66.7% mAP is hard to beat. My test-set score is 1.5% mAP higher than his, but my val score still does not exceed his.

WenxuanLi-whu avatar Jul 09 '24 09:07 WenxuanLi-whu

Hello, is the NDS you reproduced also noticeably lower?

zheng0819 avatar Jul 11 '24 08:07 zheng0819

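I'm just chasing benchmark scores; I haven't run his code. My Co-Fix3D performs slightly better than his. I hope that the author, if he comes across it, gives it a 7+ score. If this paper is accepted, I'll open-source it. I trained on 4 GPUs, so those of you with 8 GPUs can run it for me and see how high the val score can actually go. My fusion paper is also about ready to submit: my current fusion model reaches 72.0% mAP and 73.7% NDS on val.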

WenxuanLi-whu avatar Jul 11 '24 08:07 WenxuanLi-whu

Take a look at my project; it crushes all current 3D object detectors, haha. https://github.com/rubbish001/Co-Fix3d

WenxuanLi-whu avatar Jul 18 '24 08:07 WenxuanLi-whu

@rubbish001 Hello, is the screenshot in your project taken from the nuScenes leaderboard? I can't seem to find your method on the official nuScenes website.

ihaohe avatar Jul 26 '24 03:07 ihaohe

There is a submission page there that lists all kinds of algorithms. My result was obtained without test-time augmentation.

WenxuanLi-whu avatar Jul 26 '24 08:07 WenxuanLi-whu

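@rubbish001 Thanks, I found the submission leaderboard and saw your Co- series algorithms. But that leaderboard doesn't seem to distinguish which modality was used (lidar or lidar+camera), does it? Also, do you know how to move from the submission leaderboard to the official leaderboard?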

ihaohe avatar Jul 26 '24 08:07 ihaohe

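The ranking there is chaotic: all kinds of data augmentation and test-time augmentation are used, and anything with a high enough score ranks near the top. Mine uses no test-time augmentation. Most importantly, I haven't had a single paper accepted yet; I'll open-source once one is accepted. I think this idea can probably still be improved.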

WenxuanLi-whu avatar Jul 27 '24 04:07 WenxuanLi-whu

OK, thanks, bro. I hope you get into a top conference soon.

ihaohe avatar Jul 29 '24 05:07 ihaohe

Here are my results, trained on 4 A40 GPUs; they are lower than those in the paper:

Object Class AP ATE ASE AOE AVE AAE
car 0.877 0.164 0.148 0.073 0.240 0.187
truck 0.622 0.313 0.178 0.101 0.232 0.220
bus 0.766 0.304 0.190 0.042 0.461 0.240
trailer 0.428 0.513 0.222 0.487 0.219 0.165
construction_vehicle 0.251 0.696 0.431 0.868 0.126 0.310
pedestrian 0.880 0.126 0.278 0.337 0.195 0.095
motorcycle 0.751 0.183 0.230 0.216 0.351 0.253
bicycle 0.573 0.150 0.259 0.341 0.176 0.017
traffic_cone 0.763 0.112 0.316 nan nan nan
barrier 0.686 0.190 0.290 0.054 nan nan
2024-08-21 20:19:19,215 INFO ----------------Nuscene detection_cvpr_2019 results-----------------
***car error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.16, 0.15, 0.07, 0.24, 0.19 | 79.61, 88.20, 90.89, 91.94 | mean AP: 0.8765830252951994
***truck error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.31, 0.18, 0.10, 0.23, 0.22 | 42.52, 62.51, 70.38, 73.47 | mean AP: 0.6221782682212571
***construction_vehicle error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.70, 0.43, 0.87, 0.13, 0.31 | 4.66, 18.05, 32.92, 44.67 | mean AP: 0.2507487669806897
***bus error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.30, 0.19, 0.04, 0.46, 0.24 | 53.03, 77.17, 86.93, 89.35 | mean AP: 0.766220858525359
***trailer error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.51, 0.22, 0.49, 0.22, 0.17 | 12.10, 37.79, 54.93, 66.20 | mean AP: 0.4275400267613375
***barrier error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.19, 0.29, 0.05, nan, nan | 59.99, 68.19, 72.64, 73.74 | mean AP: 0.6864095524858678
***motorcycle error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.18, 0.23, 0.22, 0.35, 0.25 | 66.07, 76.56, 77.99, 79.75 | mean AP: 0.750910792428777
***bicycle error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.15, 0.26, 0.34, 0.18, 0.02 | 55.26, 57.59, 57.83, 58.34 | mean AP: 0.5725552461648097
***pedestrian error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.13, 0.28, 0.34, 0.19, 0.10 | 86.43, 87.56, 88.54, 89.50 | mean AP: 0.8800749823206577
***traffic_cone error@trans, scale, orient, vel, attr | AP@0.5, 1.0, 2.0, 4.0
0.11, 0.32, nan, nan, nan | 74.27, 75.29, 76.65, 79.02 | mean AP: 0.7630854540150979
--------------average performance-------------
trans_err: 0.2751
scale_err: 0.2543
orient_err: 0.2799
vel_err: 0.2501
attr_err: 0.1859
mAP: 0.6596
NDS: 0.7053
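The NDS on the last line of the log can be cross-checked from the other averages. A minimal sketch, assuming the standard nuScenes definition NDS = (5 * mAP + sum over the five true-positive metrics of (1 - min(1, error))) / 10; the function name `nds` is only illustrative and not part of any toolkit:

```python
def nds(mean_ap: float, tp_errors: dict[str, float]) -> float:
    """nuScenes detection score from mAP and the five mean TP errors."""
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors.values()]
    # mAP carries weight 5, each of the five TP scores carries weight 1.
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


# Averages copied from the evaluation log above.
errors = {
    "trans_err": 0.2751,
    "scale_err": 0.2543,
    "orient_err": 0.2799,
    "vel_err": 0.2501,
    "attr_err": 0.1859,
}
score = nds(0.6596, errors)
print(f"NDS: {score:.4f}")
```

With these inputs the sketch gives 0.7053, matching the logged value. Because mAP carries half of the total weight, the 0.0034 mAP shortfall against the paper's 66.3% accounts for only about 0.0017 of the NDS gap; the rest has to come from the true-positive error terms.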

AlanLiangC avatar Aug 22 '24 01:08 AlanLiangC

...

WenxuanLi-whu avatar Aug 22 '24 02:08 WenxuanLi-whu

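How bold. Anything goes, apparently; I've learned something too. Tsinghua or Peking University counts for less than sheer nerve. As long as the test-set score is pushed up, the val numbers can simply be made up.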

@rubbish001

  1. First and foremost, please strictly adhere to the configuration file I provided for training the model, particularly 8 GPUs and a batch size of 8x2, before commenting on the reproducibility of my results. All models in my paper were trained with 8 GPUs. In my experience, training the models with a batch size of 4x2 or 8x1 results in performance degradation.

  2. It is important to speak with responsibility. All my model files and training logs are still stored on my server, and I have been too busy recently to organize them. I will upload the model checkpoints, training logs, and test set results by today. I can confirm that my two works, HEDNet and SAFDNet, are fully reproducible.

  3. Please leave your comments under my repository, maintaining basic courtesy.
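The batch-size notation in point 1 reads most naturally as GPUs x samples per GPU; that reading is an interpretation of this thread, not something taken from the config file. Under that assumption, a small sketch shows that both 4x2 and 8x1 halve the number of samples per optimizer step relative to 8x2 (the helper name is hypothetical):

```python
def effective_batch_size(num_gpus: int, samples_per_gpu: int) -> int:
    """Total samples contributing to one optimizer step in data-parallel training."""
    return num_gpus * samples_per_gpu


# Settings mentioned in the comment above, written as (GPUs, samples per GPU).
settings = {"8x2": (8, 2), "4x2": (4, 2), "8x1": (8, 1)}
reference = effective_batch_size(*settings["8x2"])

for name, (gpus, per_gpu) in settings.items():
    total = effective_batch_size(gpus, per_gpu)
    print(f"{name}: {total} samples/step ({total / reference:.0%} of the 8x2 setting)")
```

A smaller total batch with an otherwise unchanged schedule changes the optimization, which may be one reason for the degradation described in point 1.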

zhanggang001 avatar Aug 22 '24 02:08 zhanggang001

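Sorry, I got a bit angry. I have deleted my comment. Also, I have looked into the effect of GPU count and batch size: I only have 4 GPUs, and I tested with 2, 3, and 4 GPUs. The difference between them was only about 0.1%.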

WenxuanLi-whu avatar Aug 22 '24 02:08 WenxuanLi-whu

Sorry for the late reply. Yesterday, I released the model checkpoint and training log (mAP 66.5% | NDS 70.9%, trained with the released code on 2024/05/09). As I mentioned in the README, since I refactored the code to unify the codebase of HEDNet and SAFDNet, I only ran one experiment to validate the correctness of the code. Last night, I ran another experiment on the nuScenes dataset (mAP 66.15% and NDS 70.51%) and found that the results fluctuated. I will try to find the reason and fix the performance gap. I apologize for any confusion caused. Also, I welcome verification of our results on the Waymo dataset.
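To put the fluctuation in numbers, the validation results reported so far in this thread can be summarized with a mean and sample standard deviation. This is only a rough sketch: the runs come from different machines and GPU counts, so it is not a controlled measurement of seed-to-seed variance.

```python
from statistics import mean, stdev

# (mAP %, NDS %) on the nuScenes validation set, as reported in this thread.
runs = {
    "released checkpoint (author)": (66.5, 70.9),
    "second run (author)": (66.15, 70.51),
    "8x RTX 3090 (zheng0819)": (66.15, 70.41),
    "4x A40 (AlanLiangC)": (65.96, 70.53),
}

map_values = [m for m, _ in runs.values()]
nds_values = [n for _, n in runs.values()]

print(f"mAP: mean {mean(map_values):.2f}, sample std {stdev(map_values):.2f}")
print(f"NDS: mean {mean(nds_values):.2f}, sample std {stdev(nds_values):.2f}")
print(f"NDS gap to paper (71.0): {71.0 - mean(nds_values):.2f}")
```

With only four data points the spread estimate is coarse, but it gives a baseline for judging whether a single reproduction falls within the normal run-to-run range.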

zhanggang001 avatar Aug 23 '24 10:08 zhanggang001

Please see the comment above. I apologize for any confusion caused, again. I will try to find the reason and fix the performance gap asap.

zhanggang001 avatar Aug 23 '24 10:08 zhanggang001

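Impressive, you really are good. I recently picked up another trick, borrowed from DAL or IS-Fusion: doubling the channel count of pst_middle_sparse still improves the score. Once I get A100s, I'm going to max out this dataset.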

WenxuanLi-whu avatar Aug 23 '24 13:08 WenxuanLi-whu

If you want, you can just do it. One more suggestion: if you aim to achieve better performance based on TransFusion-L, you might consider using the mmdetection3d framework, as it can yield higher performance than OpenPCDet according to my experience. In the original implementation of HEDNet, we used mmdetection3d for the nuScenes dataset, but used OpenPCDet for the Waymo Open dataset. Here, we have released a unified codebase for HEDNet and SAFDNet on the Waymo, nuScenes, and Argoverse2 datasets. Actually, there are many ways to improve detection accuracy; however, the design of HEDNet and SAFDNet is not for that.

Additionally, this is the repo for HEDNet and SAFDNet, not a place to promote other work or discuss unrelated matters. It's also worth noting that this is an English repo, so please try to reply in English. Thank you.

zhanggang001 avatar Aug 23 '24 14:08 zhanggang001

Thank you for your suggestion.

WenxuanLi-whu avatar Aug 25 '24 17:08 WenxuanLi-whu

Thanks for the reply! I think I understand. If possible, could you provide the reproducible code from before the refactoring? It would be much appreciated. I'm now trying to reproduce SAFDNet on the nuScenes test set with 8 RTX 4090 GPUs, but the reproduced mAP is only 67.44 and the NDS only 71.80 so far (without double flip).

zheng0819 avatar Aug 26 '24 08:08 zheng0819

We have pushed a new dev branch, in which the code and configs are updated to match the performance reported in the paper. We ran experiments under various settings for SAFDNet on the nuScenes dataset and achieved an NDS of 70.7~71.2 (71.0 in the paper). Results on the nuScenes dataset are unstable (including with our original implementation), so you may need to run more than one experiment to achieve good results.

Please note that we use a 2D sparse backbone in the released code because the original implementation of TransFusion-L in OpenPCDet achieved worse results (NDS 69.4). When running inference on the test set, please set the number of queries in the TransFusion head to 300 (test only), following the TransFusion paper. We are still running experiments to validate the other models for both HEDNet and SAFDNet.
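The test-only query setting could be expressed as a config override applied just before inference. The sketch below uses a plain nested dict rather than the real OpenPCDet config object, and both the key path `MODEL.DENSE_HEAD.NUM_PROPOSALS` and the training value of 200 are assumptions about how the TransFusion head is configured; check the released config for the actual names and values.

```python
import copy


def with_test_queries(cfg: dict, num_queries: int = 300) -> dict:
    """Return a copy of cfg with the head's query count changed for inference."""
    test_cfg = copy.deepcopy(cfg)  # leave the training config untouched
    # Assumed key path; adjust it to match the real config file.
    test_cfg["MODEL"]["DENSE_HEAD"]["NUM_PROPOSALS"] = num_queries
    return test_cfg


# Hypothetical training config fragment (names and values are placeholders).
train_cfg = {"MODEL": {"DENSE_HEAD": {"NAME": "TransFusionHead", "NUM_PROPOSALS": 200}}}
test_cfg = with_test_queries(train_cfg)

print(train_cfg["MODEL"]["DENSE_HEAD"]["NUM_PROPOSALS"])
print(test_cfg["MODEL"]["DENSE_HEAD"]["NUM_PROPOSALS"])
```

Working on a copy keeps the training configuration intact, so the same file can be reused for validation runs that keep the original query count.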

zhanggang001 avatar Sep 03 '24 15:09 zhanggang001

Could the author lend me some GPUs to speed up my experiments? 4090, A100, or A800 would all be fine. I feel I could make it into the top ten of the nuScenes leaderboard with them.

WenxuanLi-whu avatar Sep 03 '24 16:09 WenxuanLi-whu

Sorry, I did not mention that I used only the train gt_database (without the val gt_database); this may be the main reason. My apologies.

zheng0819 avatar Sep 04 '24 03:09 zheng0819

Do you mean that the author's nuScenes test results come from a model trained on both the train and val sets?

asd291614761 avatar Sep 12 '24 07:09 asd291614761

Yes, it is a common setting on the nuScenes dataset.

zhanggang001 avatar Sep 12 '24 17:09 zhanggang001