Lidar_AI_Solution
Inference time 25 FPS
Hello! I am trying to reach the 25 FPS on the Orin by following the steps in the README. I am using the models from the zip (resnet50 int8 ONNX and PTQ models), but the inference rate I get with them is only around 17 Hz:
==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 1.51542 ms
[⏰ [NoSt] ImageNrom]: 7.68278 ms
[⏰ Lidar Backbone]: 48.93895 ms
[⏰ Camera Depth]: 0.15978 ms
[⏰ Camera Backbone]: 18.95363 ms
[⏰ Camera Bevpool]: 1.99872 ms
[⏰ VTransform]: 2.18362 ms
[⏰ Transfusion]: 15.96915 ms
[⏰ Head BoundingBox]: 17.20211 ms
Total: 105.406 ms
=============================================
==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 0.85197 ms
[⏰ [NoSt] ImageNrom]: 8.72378 ms
[⏰ Lidar Backbone]: 18.55526 ms
[⏰ Camera Depth]: 0.10973 ms
[⏰ Camera Backbone]: 14.36374 ms
[⏰ Camera Bevpool]: 1.55261 ms
[⏰ VTransform]: 1.65760 ms
[⏰ Transfusion]: 7.19910 ms
[⏰ Head BoundingBox]: 11.77533 ms
Total: 55.213 ms
=============================================
==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 0.56618 ms
[⏰ [NoSt] ImageNrom]: 5.23469 ms
[⏰ Lidar Backbone]: 20.24752 ms
[⏰ Camera Depth]: 0.10736 ms
[⏰ Camera Backbone]: 12.72925 ms
[⏰ Camera Bevpool]: 1.56182 ms
[⏰ VTransform]: 1.64019 ms
[⏰ Transfusion]: 8.37146 ms
[⏰ Head BoundingBox]: 10.38650 ms
Total: 55.044 ms
=============================================
==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 1.48154 ms
[⏰ [NoSt] ImageNrom]: 5.11184 ms
[⏰ Lidar Backbone]: 19.40304 ms
[⏰ Camera Depth]: 0.10710 ms
[⏰ Camera Backbone]: 15.52259 ms
[⏰ Camera Bevpool]: 1.52963 ms
[⏰ VTransform]: 1.67616 ms
[⏰ Transfusion]: 8.33219 ms
[⏰ Head BoundingBox]: 14.06886 ms
Total: 60.640 ms
=============================================
==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 0.38928 ms
[⏰ [NoSt] ImageNrom]: 3.36355 ms
[⏰ Lidar Backbone]: 18.56685 ms
[⏰ Camera Depth]: 0.10810 ms
[⏰ Camera Backbone]: 14.83594 ms
[⏰ Camera Bevpool]: 1.55382 ms
[⏰ VTransform]: 1.70230 ms
[⏰ Transfusion]: 7.22714 ms
[⏰ Head BoundingBox]: 14.19894 ms
Total: 58.193 ms
=============================================
==================BEVFusion===================
[⏰ [NoSt] CopyLidar]: 0.38029 ms
[⏰ [NoSt] ImageNrom]: 5.25728 ms
[⏰ Lidar Backbone]: 17.51168 ms
[⏰ Camera Depth]: 0.10851 ms
[⏰ Camera Backbone]: 15.91882 ms
[⏰ Camera Bevpool]: 1.55072 ms
[⏰ VTransform]: 1.69590 ms
[⏰ Transfusion]: 7.21994 ms
[⏰ Head BoundingBox]: 12.86070 ms
Total: 56.866 ms
=============================================
What steps do I have to follow to reach 25 FPS? I am using a Jetson AGX Orin Developer Kit with these versions:
- Jetpack: 5.2.1
- L4T: 35.4.1
- CUDA: 11.4.315
- cuDNN: 8.6.0.166
- TensorRT: 8.5.2.2
The output was generated by following the steps of the repository with the example-data. Thank you so much.
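For reference, the "around 17 Hz" figure can be reproduced from the "Total:" lines in the log above. This is a minimal sketch, not part of the repository: the function name `summarize_totals` and the choice to drop the first frame as warm-up are assumptions made here for illustration.

```python
import re

# Matches the per-frame summary line printed in the log, e.g. "Total: 55.213 ms".
TOTAL_RE = re.compile(r"^Total:\s*([0-9.]+)\s*ms", re.MULTILINE)


def summarize_totals(log_text, warmup=1):
    """Return (mean latency in ms, FPS) over the frames after the warm-up frames."""
    totals = [float(v) for v in TOTAL_RE.findall(log_text)]
    # Assumption: the first `warmup` frames are slower and not representative.
    steady = totals[warmup:]
    if not steady:
        raise ValueError("no frames left after discarding warm-up")
    mean_ms = sum(steady) / len(steady)
    return mean_ms, 1000.0 / mean_ms


if __name__ == "__main__":
    # The six totals reported in the post above.
    sample = "\n".join(
        f"Total: {t} ms"
        for t in ("105.406", "55.213", "55.044", "60.640", "58.193", "56.866")
    )
    mean_ms, fps = summarize_totals(sample)
    print(f"mean {mean_ms:.3f} ms -> {fps:.2f} FPS (25 FPS needs <= 40 ms per frame)")
```

With the first frame excluded, the remaining five frames average a little over 57 ms, which is where the roughly 17 FPS comes from.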
You should check your device clock frequencies. For example, run jetson_clocks.
Hello, thank you for your answer. I have been running some tests, but I do not know how to continue with the frequency configuration. Everything seems to be okay:
sudo jetson_clocks --show
SOC family:tegra234 Machine:Jetson AGX Orin Developer Kit
Online CPUs: 0-11
cpu0: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu1: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu2: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu3: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu4: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu5: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu6: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu7: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu8: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu9: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu10: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
cpu11: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
GPU MinFreq=1300500000 MaxFreq=1300500000 CurrentFreq=1300500000
EMC MinFreq=204000000 MaxFreq=3199000000 CurrentFreq=3199000000 FreqOverride=1
DLA0_CORE: Online=1 MinFreq=0 MaxFreq=1600000000 CurrentFreq=1600000000
DLA0_FALCON: Online=1 MinFreq=0 MaxFreq=844800000 CurrentFreq=844800000
DLA1_CORE: Online=1 MinFreq=0 MaxFreq=1600000000 CurrentFreq=1600000000
DLA1_FALCON: Online=1 MinFreq=0 MaxFreq=844800000 CurrentFreq=844800000
PVA0_VPS0: Online=1 MinFreq=0 MaxFreq=1152000000 CurrentFreq=1152000000
PVA0_AXI: Online=1 MinFreq=0 MaxFreq=832000000 CurrentFreq=832000000
FAN Dynamic Speed control=active hwmon2_pwm1=56
NV Power Mode: MAXN
And the mean time when the output is printed to the terminal is:
Mean: 54.629 ms
But when I save the output to a txt file instead of printing it to the terminal, the time is reduced:
Mean: 48.823 ms
Here is my jtop configuration while jetson_clocks is running.
I think I am close to the 40 ms (25 FPS) target, but something is still missing in my configuration.
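The "everything seems to be okay" reading of the `jetson_clocks --show` output above can be checked mechanically. The following is a sketch under assumptions: it only relies on the `Key=Value` layout visible in the pasted output, and the helper name `units_below_max` is made up for this example. It compares CurrentFreq with MaxFreq only, since the EMC line above has a lower MinFreq but still runs at its maximum.

```python
import re

# Matches numeric "Key=Value" pairs such as "MaxFreq=1300500000".
PAIR_RE = re.compile(r"(\w+)=(\d+)")


def units_below_max(show_output):
    """Return the names of units whose CurrentFreq is lower than their MaxFreq."""
    slow = []
    for line in show_output.splitlines():
        fields = dict(PAIR_RE.findall(line))
        if "MaxFreq" not in fields or "CurrentFreq" not in fields:
            continue  # e.g. the FAN, SOC and power-mode lines
        name = line.split()[0].rstrip(":")
        if int(fields["CurrentFreq"]) < int(fields["MaxFreq"]):
            slow.append(name)
    return slow


if __name__ == "__main__":
    # A few lines copied from the output above.
    sample = """\
cpu0: Online=1 Governor=schedutil MinFreq=2201600 MaxFreq=2201600 CurrentFreq=2201600 IdleStates: WFI=0 c7=0
GPU MinFreq=1300500000 MaxFreq=1300500000 CurrentFreq=1300500000
EMC MinFreq=204000000 MaxFreq=3199000000 CurrentFreq=3199000000 FreqOverride=1
NV Power Mode: MAXN
"""
    print(units_below_max(sample))
```

An empty list for the posted output would be consistent with the clocks already being pinned, which suggests the remaining gap to 40 ms lies elsewhere.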
Hi, may I ask what mAP you got? I can only get 0.5753 mAP after running test-mAP-for-cuda.py and I don't know why. Thank you so much!
Similar issue: I get 60 ms+ when running on Orin following the author's instructions. The Lidar Backbone and Camera Backbone take too much time. Can you give me some pointers on what to check? Thanks @dav695 @hopef
[⏰ [NoSt] CopyLidar]: 0.40496 ms
[⏰ [NoSt] ImageNrom]: 0.59677 ms
[⏰ Lidar Backbone]: 28.70608 ms
[⏰ Camera Depth]: 3.53670 ms
[⏰ Camera Backbone]: 8.67360 ms
[⏰ Camera Bevpool]: 1.45706 ms
[⏰ VTransform]: 1.74602 ms
[⏰ Transfusion]: 7.06768 ms
[⏰ Head BoundingBox]: 10.52410 ms
Total: 61.782 ms
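To see which stage dominates a frame like the one above, the per-stage lines can be ranked by their share of the time. This is a sketch, not repository code; `rank_stages` is a name invented here. It leaves out the stages marked [NoSt] because, in the first two frames of the original post, the printed Total equals the sum of the remaining stages.

```python
import re

# Captures: optional "[NoSt] " marker, stage name, time in ms.
STAGE_RE = re.compile(r"\[⏰ (\[NoSt\] )?([^\]]+)\]: ([0-9.]+) ms")


def rank_stages(frame_text):
    """Return [(stage, ms, share)] sorted by time, ignoring [NoSt] stages."""
    stages = [
        (name, float(ms))
        for nost, name, ms in STAGE_RE.findall(frame_text)
        if not nost
    ]
    total = sum(ms for _, ms in stages)
    if total == 0:
        return []
    ranked = [(name, ms, ms / total) for name, ms in stages]
    return sorted(ranked, key=lambda s: s[1], reverse=True)


if __name__ == "__main__":
    frame = (
        "[⏰ [NoSt] CopyLidar]: 0.40496 ms [⏰ [NoSt] ImageNrom]: 0.59677 ms "
        "[⏰ Lidar Backbone]: 28.70608 ms [⏰ Camera Depth]: 3.53670 ms "
        "[⏰ Camera Backbone]: 8.67360 ms [⏰ Camera Bevpool]: 1.45706 ms "
        "[⏰ VTransform]: 1.74602 ms [⏰ Transfusion]: 7.06768 ms "
        "[⏰ Head BoundingBox]: 10.52410 ms"
    )
    for name, ms, share in rank_stages(frame):
        print(f"{name:18s} {ms:9.3f} ms  {share:6.1%}")
```

For this frame the Lidar Backbone is the largest single stage, followed by Head BoundingBox and Camera Backbone.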
You should make sure your device is running in MAXN mode.
sudo nvpmodel -q
NV Power Mode: MAXN
To change to MAXN, use the command:
sudo nvpmodel -m 0
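The power-mode check above can also be scripted before each benchmark run. This is a sketch under assumptions: it expects the "NV Power Mode: ..." line shown above in the output of `nvpmodel -q`, and the function names are hypothetical. `current_power_mode` only works on a Jetson and may need root.

```python
import re
import subprocess


def parse_power_mode(query_output):
    """Extract the mode name from `nvpmodel -q` output, or None if it is absent."""
    match = re.search(r"NV Power Mode:\s*(\S+)", query_output)
    return match.group(1) if match else None


def current_power_mode():
    """Run `nvpmodel -q` on the device and return the reported mode name."""
    result = subprocess.run(
        ["nvpmodel", "-q"], capture_output=True, text=True, check=True
    )
    return parse_power_mode(result.stdout)


if __name__ == "__main__":
    # Parsing the line quoted above, without touching the device.
    print(parse_power_mode("NV Power Mode: MAXN"))
```

A benchmark script could compare the result against "MAXN" and refuse to run otherwise, so that slow numbers are not collected by accident.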
I have tried to use sudo nvpmodel -m 0 to change the power mode to MAXN, but it seems to fail:
sudo nvpmodel -m 0
NVPM ERROR: null input file!
NVPM ERROR: Failed to parse pm.conf
I also compared the same model and code between an x86-based computer (total time ~34 ms) and the Orin (total time ~62 ms). The SM count and the memory clock rate differ a lot:
[11/09/2023-10:12:43] [I] === Device Information ===
[11/09/2023-10:12:43] [I] Selected Device: NVIDIA GeForce RTX 3060 Laptop GPU
[11/09/2023-10:12:43] [I] Compute Capability: 8.6
[11/09/2023-10:12:43] [I] SMs: 30
[11/09/2023-10:12:43] [I] Compute Clock Rate: 1.425 GHz
[11/09/2023-10:12:43] [I] Device Global Memory: 5946 MiB
[11/09/2023-10:12:43] [I] Shared Memory per SM: 100 KiB
[11/09/2023-10:12:43] [I] Memory Bus Width: 192 bits (ECC disabled)
[11/09/2023-10:12:43] [I] Memory Clock Rate: 7.001 GHz
[11/08/2023-09:31:00] [I] === Device Information ===
[11/08/2023-09:31:00] [I] Selected Device: Orin
[11/08/2023-09:31:00] [I] Compute Capability: 8.7
[11/08/2023-09:31:00] [I] SMs: 16
[11/08/2023-09:31:00] [I] Compute Clock Rate: 1.275 GHz
[11/08/2023-09:31:00] [I] Device Global Memory: 24845 MiB
[11/08/2023-09:31:00] [I] Shared Memory per SM: 164 KiB
[11/08/2023-09:31:00] [I] Memory Bus Width: 128 bits (ECC disabled)
[11/08/2023-09:31:00] [I] Memory Clock Rate: 1.275 GHz
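As a rough sanity check on the two device reports above, the SM count times the compute clock can be compared with the measured latency ratio. This is only a crude proxy chosen for illustration: it ignores memory bandwidth, per-SM differences between the two GPUs and everything outside the GPU, so it should not be read as a performance model.

```python
def sm_clock_product(sms, clock_ghz):
    """Crude compute proxy: number of SMs times the compute clock in GHz."""
    return sms * clock_ghz


# Values copied from the two "Device Information" reports above.
rtx3060 = sm_clock_product(sms=30, clock_ghz=1.425)
orin = sm_clock_product(sms=16, clock_ghz=1.275)

# Ratio of the crude proxy (laptop GPU over Orin).
throughput_ratio = rtx3060 / orin

# Ratio of the measured total times (~62 ms on Orin, ~34 ms on x86).
latency_ratio = 62.0 / 34.0

if __name__ == "__main__":
    print(f"SM x clock ratio: {throughput_ratio:.2f}")
    print(f"latency ratio:    {latency_ratio:.2f}")
```

Both ratios come out at roughly 2, so under this proxy the slowdown on the Orin relative to the laptop GPU is about what the hardware difference alone would suggest.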
Yeah, my timing is the same as yours, around 50 ms, even though I have selected the MAXN mode. I really want to know how to achieve the FPS reported in the project.