CUDA out of memory

Open KotlinWang opened this issue 1 year ago • 9 comments

Hello, very good work. I trained MonoDETR on a single 3090 and got a "CUDA out of memory" error. I use the default monodetr.yaml settings, and my environment follows the requirements in README.md. What causes this problem during the training phase? Looking forward to your reply, thank you!

KotlinWang avatar Oct 06 '23 12:10 KotlinWang

Same problem encountered.

charmeleonz avatar Oct 12 '23 08:10 charmeleonz

I can only set the batch size to 14 using a single 3090 graphics card, and the network training is very unstable.

KotlinWang avatar Oct 12 '23 17:10 KotlinWang

Same problem encountered!

yjy4231 avatar Oct 15 '23 05:10 yjy4231

Can I see the results of your reproduction? Using a 3090 graphics card with a batch size of 14, I got an AP_40 of 17.

KotlinWang avatar Oct 16 '23 09:10 KotlinWang

Car AP@0.70, 0.70, 0.70:
bbox AP:90.4341, 88.1947, 79.9611
bev AP:39.2351, 30.7552, 26.5470
3d AP:28.5708, 22.4689, 20.4412
aos AP:89.67, 86.37, 77.71
Car AP_R40@0.70, 0.70, 0.70:
bbox AP:96.1279, 89.7959, 82.5666
bev AP:37.3690, 26.5359, 22.8405
3d AP:26.4230, 19.8301, 16.8303
aos AP:95.24, 87.83, 80.01
Car AP@0.70, 0.50, 0.50:
bbox AP:90.4341, 88.1947, 79.9611
bev AP:71.6413, 53.7894, 47.6267
3d AP:65.7944, 48.1693, 45.8162
aos AP:89.67, 86.37, 77.71
Car AP_R40@0.70, 0.50, 0.50:
bbox AP:96.1279, 89.7959, 82.5666
bev AP:71.5228, 52.7067, 46.6121
3d AP:67.7813, 48.3621, 43.4522
aos AP:95.24, 87.83, 80.01

I only get an AP40 of 19.81 at the Moderate level.

yjy4231 avatar Oct 16 '23 10:10 yjy4231

Hello, may I know your graphics device model?

KotlinWang avatar Oct 16 '23 13:10 KotlinWang

A single 3090 GPU with batch_size=14.

yjy4231 avatar Oct 16 '23 13:10 yjy4231

The original version is for the 3090, while the stable version is for the A100. With the Group DETR technique, CUDA memory usage can reach 40 GB.

Ivan-Tang-3D avatar Nov 24 '23 15:11 Ivan-Tang-3D

If you want to adapt the model to a 3090, you can set the group_detr param in the cfg to 1 and comment out lines 467-473 (the conditional part) in https://github.com/ZrrSkywalker/MonoDETR/blob/main/lib/models/monodetr/depthaware_transformer.py; the model then reverts to the original version.

Ivan-Tang-3D avatar Dec 17 '23 05:12 Ivan-Tang-3D
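
For anyone applying the config change suggested in the last comment, here is a minimal sketch of setting the group_detr param to 1 in an already-parsed config. It is an illustration only: the helper `set_nested_key` is hypothetical, the dict below is a made-up stand-in for monodetr.yaml with placeholder values, and the key name `group_detr` is taken from the comment above, so the actual key name and location in the repo's config may differ.

```python
from typing import Any


def set_nested_key(cfg: Any, key: str, value: Any) -> int:
    """Set every occurrence of `key` in a nested dict/list config.

    Hypothetical helper, not part of MonoDETR. Returns how many
    entries were changed, so a return value of 0 signals that the
    key name assumed here does not exist in the config.
    """
    changed = 0
    if isinstance(cfg, dict):
        for k in cfg:
            if k == key:
                cfg[k] = value
                changed += 1
            else:
                changed += set_nested_key(cfg[k], key, value)
    elif isinstance(cfg, list):
        for item in cfg:
            changed += set_nested_key(item, key, value)
    return changed


# Made-up stand-in for the parsed monodetr.yaml; all values are placeholders.
cfg = {
    "dataset": {"batch_size": 14},
    "model": {"group_detr": 11, "num_queries": 50},
}

# Assumption: the parameter is stored under a key literally named "group_detr".
n_changed = set_nested_key(cfg, "group_detr", 1)
print(n_changed, cfg["model"]["group_detr"])
```

With PyYAML installed, the real file could be read with `yaml.safe_load` and written back with `yaml.safe_dump`, though round-tripping this way drops any comments in the YAML file. The code edit to depthaware_transformer.py still has to be made by hand.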