
Inference time on Jetson Nano

Open sieuwe1 opened this issue 3 years ago • 5 comments

What would be the expected inference time on a Jetson Nano, with 4 to 5 different depth levels?

Thanks

Sieuwe

sieuwe1 avatar Dec 22 '20 23:12 sieuwe1

For what it's worth, here are inference tests using the default configuration on Jetson Nano 2GB:

SCENEFLOW

CUDA_VISIBLE_DEVICES=0 python3 run_binary_depth_estimation.py \
    --arch bi3dnet_binary_depth \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_featnethr_arch featextractnethr \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch segrefinenet \
    --featextractnethr_out_planes 16 \
    --segrefinenet_in_planes 17 \
    --segrefinenet_out_planes 8 \
    --crop_height 576 --crop_width 960 \
    --disp_vals 24 36 54 96 144 \
    --img_left  '../data/sf_img_left.jpg' \
    --img_right '../data/sf_img_right.jpg' \
    --pretrained '../model_weights/sf_binary_depth.pth.tar'

Measured with time: real 2m24.842s (sf_img_left_bi3dnet_binary_depth_quant_depth_0-24-36-54-96-144-192)

KITTI15

CUDA_VISIBLE_DEVICES=0 python3 run_binary_depth_estimation.py \
    --arch bi3dnet_binary_depth \
    --bi3dnet_featnet_arch featextractnetspp \
    --bi3dnet_featnethr_arch featextractnethr \
    --bi3dnet_segnet_arch segnet2d \
    --bi3dnet_refinenet_arch segrefinenet \
    --featextractnethr_out_planes 16 \
    --segrefinenet_in_planes 17 \
    --segrefinenet_out_planes 8 \
    --crop_height 384 --crop_width 1248 \
    --disp_vals 12 21 30 39 48 \
    --img_left  '../data/kitti15_img_left.jpg' \
    --img_right '../data/kitti15_img_right.jpg' \
    --pretrained '../model_weights/kitti15_binary_depth.pth.tar'

Measured with time: real 1m41.397s (kitti15_img_left_bi3dnet_binary_depth_quant_depth_0-12-21-30-39-48-192)
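For easier comparison, the two wall-clock figures above can be converted to seconds and divided by the number of values passed to --disp_vals (five in both runs). This is only a rough sketch: the helper names are made up for illustration, and since time measures the whole process (Python startup, loading the weights, writing the result), the per-level number is an upper bound rather than a pure inference time.

```python
import re

def parse_real_time(text):
    """Convert a `time`-style value such as '2m24.842s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognised time format: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)

def seconds_per_level(real_time, num_levels):
    """Average wall-clock seconds per disparity level (includes startup cost)."""
    return parse_real_time(real_time) / num_levels

if __name__ == "__main__":
    # Values copied from the two runs above; both used five --disp_vals.
    runs = {
        "SceneFlow 576x960": ("2m24.842s", 5),
        "KITTI15 384x1248": ("1m41.397s", 5),
    }
    for name, (real_time, levels) in runs.items():
        total = parse_real_time(real_time)
        per_level = seconds_per_level(real_time, levels)
        print(f"{name}: {total:.3f} s total, {per_level:.2f} s per level")
```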

Based on my experience, the Jetson Nano is not powerful enough to run Bi3D properly. The main issue is the lack of memory: you can't run the continuous mode at all, as the process gets killed by the system (the examples above are from the quantized mode). It is possible that the 4GB model could also run the continuous mode.

jankais3r avatar Jan 25 '21 00:01 jankais3r

@jankais3r thanks for the performance results. It does indeed look like the Jetson Nano 2GB won't be able to do it. How many depth levels did you use, and at what resolution?

Would a GTX 1080 definitely be strong enough for continuous mode?

Thanks

Sieuwe

sieuwe1 avatar Jan 26 '21 15:01 sieuwe1

I don't have access to any other Nvidia GPU besides the Jetson Nano, but I tried to run Bi3D in CPU-only mode on my MacBook. I was able to generate a map with bi3dnet_continuous_depth_2D, and the process used up to 19.9GB of RAM at one point. I also tried to run bi3dnet_continuous_depth_3D, which used up to 33.5GB of RAM before my system ran out of memory and the process was killed by the OS, so I don't know how much more memory it would need. I assume that if you ran this on a GPU, a large portion of this memory would come from the GPU rather than system RAM, but AFAIK all Jetsons have a unified memory design, and based on this test none of them has enough memory to run this inference in full.
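For anyone who wants to repeat this kind of memory measurement, here is a minimal sketch using only the Python standard library on Linux or macOS. The function name is made up for illustration and the command you pass in is up to you. It reports the peak resident memory of the child process, which for a CPU-only run approximates the numbers quoted above; it does not account for GPU memory on a discrete card.

```python
import resource
import subprocess
import sys

def peak_child_rss_bytes(cmd):
    """Run cmd to completion and return the peak resident set size in bytes.

    RUSAGE_CHILDREN reports the largest ru_maxrss among all child processes
    this interpreter has waited for, so use a fresh interpreter per measurement.
    """
    subprocess.run(cmd, check=True)
    peak = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    return peak if sys.platform == "darwin" else peak * 1024

if __name__ == "__main__":
    # Pass the command to measure after the script name, for example the
    # run_binary_depth_estimation.py invocation with its usual arguments.
    peak = peak_child_rss_bytes(sys.argv[1:])
    print(f"peak RSS: {peak / 1e9:.2f} GB")
```

Note that if the OS kills the process for running out of memory, subprocess.run raises CalledProcessError here, so the reading is only available for runs that finish.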

jankais3r avatar Jan 30 '21 17:01 jankais3r

Hi @jankais3r, I am trying to use it on a Jetson Xavier NX. Can you let me know which version of Docker you are using? The recommended version is 19.03.11 and I have 20.10.7, and I am running into an issue while building with Docker.

EhrazImam avatar Sep 13 '21 06:09 EhrazImam

Hi, I cannot comment on the Docker approach, as I used a Conda environment for my tests.

jankais3r avatar Sep 14 '21 12:09 jankais3r