TensorRT-Yolov3
[Training] Can you give me the GitHub repo for training the model on Caffe?
Hi @lewes6369, I'm very happy because your repo has helped me so much with my work. But can you tell me which link you used for training? I'm a noob in Caffe, so I tried converting from Keras or TensorFlow to Caffe, but it's hard and I hit many bugs along the way. Can you help me, please?
Hi @cong235, I am happy it helped your work. You can train the darknet model using the official yolov3 repo: https://github.com/pjreddie/darknet. Next, convert it to a caffemodel by git and git. You can also try https://github.com/eric612/MobileNet-YOLO, which trains the yolo model directly in the Caffe framework.
I got it. Thanks for your work :D
Hi @lewes6369, I resolved my problem, but I have one question because I don't know about this. I use the pretrained yolov3 from darknet and train just 1 class, my cane. So I configured it to train with 1 class and set filters = (classes + 5) * 3 = (1 + 5) * 3 = 18. Then I converted it to a caffe model and used the following command to detect my cane:

./install/runYolov3 --caffemodel=./model/yolov3_cane.caffemodel --prototxt=./model/yolov3_cane_trt.prototxt --W=416 --H=416 --class=1 --mode=fp16 --input=./doge.jpg

It gets a core dump. But if I change --class=80 it works, and class 0 changes from person (as in your yolov3 model) to cane with my model. Also, although I trained just 1 class, it detects many other objects, like the COCO model. What does that mean? And can I fix the output model to detect only 1 class (my cane)? I think that would improve performance. Thanks for reading!
Hi @cong235, did you modify CLASS_NUM in the file tensorRTWrapper/code/include/YoloConfigs.h to one class? Not only the command-line flag but also this header needs the class count changed. I will merge these two settings into the command line later.
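For reference, a minimal sketch of that header edit (the exact declaration style is an assumption; only the constant name and file path come from this thread):

// tensorRTWrapper/code/include/YoloConfigs.h (sketch)
//static const int CLASS_NUM = 80;  // original COCO setting
static const int CLASS_NUM = 1;     // one custom class ("cane"), matching --class=1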
Thanks for your reply. I got it; I had forgotten that :D P.S.: Hmm, can I write a custom layer without coding CUDA?
Yes. If you don't code it in CUDA, you have to run the custom layer on the CPU, and it will cost time for the communication between CPU memory and the GPU device. Doing only the last, low-compute layer on the CPU is just fine.
Does that mean that after converting the Caffe model to a TRT engine, that layer will run on the CPU instead of the GPU?
After converting to a TRT engine, all supported layers run on the GPU. A custom layer can run on either the CPU or the GPU, depending on your implementation. If you want to run it on the GPU, just write it in CUDA. See the code in YoloLayer.cu; it contains both CPU and GPU implementations.
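To make that cost concrete, here is a minimal sketch (not the repo's actual code) of a custom-layer enqueue() that does its math on the CPU; the two memcpys are the host/device communication mentioned above. The signature follows the classic TensorRT IPlugin enqueue(), and count (the per-batch element count) stands in for whatever the real plugin records at configure time:

#include <cuda_runtime.h>
#include <algorithm>
#include <vector>

int enqueueOnCpu(int batchSize, const void* const* inputs, void** outputs,
                 cudaStream_t stream, size_t count)
{
    std::vector<float> host(count * batchSize);

    // device -> host: this round trip is the extra cost of a CPU custom layer
    cudaMemcpyAsync(host.data(), inputs[0], host.size() * sizeof(float),
                    cudaMemcpyDeviceToHost, stream);
    cudaStreamSynchronize(stream); // the data must arrive before the CPU touches it

    for (float& v : host)          // the layer math itself (a ReLU-like stand-in)
        v = std::max(0.0f, v);

    // host -> device: hand the result back to the GPU pipeline
    cudaMemcpyAsync(outputs[0], host.data(), host.size() * sizeof(float),
                    cudaMemcpyHostToDevice, stream);
    return 0;
}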
Thanks for your reply. Hmm, I tried to use the same prototxt config with yolo-tiny, but it's not working. So are there differences between yolov3 and yolov3-tiny? I need to change the code in the TensorRT wrapper, right?
Hi @lewes6369, I'm working with a Jetson TX2 for real-time detection. With batchsize = 1 and yolov3 it takes 140 ms per image, about 7 FPS. Now I use batchsize = 4 and it takes 530 ms per batch, still about 7 FPS. Can you give me some suggestions to increase it? My goal is 15 FPS with batchsize = 4. I have tried reducing the yolo kernels, CHECK_COUNT, and the number of anchor boxes, but I still can't gain much. Do you have any suggestions? Thanks for reading :dancer:
Hi @cong235,
I am also trying to run tiny-yolo-3l.cfg. Have you already managed to change the TensorRT wrapper for tiny yolo? Can you please let me know the changes you made?
Thanks
Yes, I got it running with yolov3-tiny. You should change the TensorRT wrapper and YoloConfigs.h:

//YOLO 416
YoloKernel yolo1 = { 13, 13, {81,82, 135,169, 344,319} };
YoloKernel yolo2 = { 26, 26, {10,14, 23,27, 37,58} };

And comment out the line mYoloKernel.push_back(yolo3); in YoloLayer.cu. Now you can run it.
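For clarity, this is how the kernel registration in YoloLayer.cu ends up after that edit (a sketch; the surrounding constructor code is assumed, only the push_back lines come from this thread):

mYoloKernel.push_back(yolo1);   // 13x13 detection head
mYoloKernel.push_back(yolo2);   // 26x26 detection head
//mYoloKernel.push_back(yolo3); // commented out: yolov3-tiny has no 52x52 head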
@cong235 - Actually I am running the tiny-yolo-3l.cfg model. According to this model, I changed the YoloConfigs.h file with the appropriate anchors and CLASS_NUM = 2 (because I have 2 classes to detect). But the inference code doesn't give me any detections. Have you faced a similar problem? Do we need to change something else? Thanks for your help in advance!
You need to change your tiny .cfg too. Comment out the "upsample_param" blocks, and modify the prototxt so the last layer is:

layer {
  #the bottoms are the yolo input layers
  bottom: "layer16-conv"
  bottom: "layer23-conv"
  top: "yolo-det"
  name: "yolo-det"
  type: "Yolo"
}
@cong235 I already did that! See my .cfg file attached! Still I can't get any detections. Maybe some hard-coded confidence thresholds?

name: "Darkent2Caffe"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 608
input_dim: 608
layer { bottom: "data" top: "layer1-conv" name: "layer1-conv" type: "Convolution" convolution_param { num_output: 16 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer1-conv" top: "layer1-conv" name: "layer1-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer1-conv" top: "layer1-conv" name: "layer1-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer1-conv" top: "layer1-conv" name: "layer1-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer1-conv" top: "layer2-maxpool" name: "layer2-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer2-maxpool" top: "layer3-conv" name: "layer3-conv" type: "Convolution" convolution_param { num_output: 32 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer3-conv" top: "layer3-conv" name: "layer3-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer3-conv" top: "layer3-conv" name: "layer3-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer3-conv" top: "layer3-conv" name: "layer3-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer3-conv" top: "layer4-maxpool" name: "layer4-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer4-maxpool" top: "layer5-conv" name: "layer5-conv" type: "Convolution" convolution_param { num_output: 64 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer5-conv" top: "layer5-conv" name: "layer5-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer5-conv" top: "layer5-conv" name: "layer5-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer5-conv" top: "layer5-conv" name: "layer5-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer5-conv" top: "layer6-maxpool" name: "layer6-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer6-maxpool" top: "layer7-conv" name: "layer7-conv" type: "Convolution" convolution_param { num_output: 128 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer7-conv" top: "layer7-conv" name: "layer7-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer7-conv" top: "layer7-conv" name: "layer7-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer7-conv" top: "layer7-conv" name: "layer7-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer7-conv" top: "layer8-maxpool" name: "layer8-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer8-maxpool" top: "layer9-conv" name: "layer9-conv" type: "Convolution" convolution_param { num_output: 256 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer9-conv" top: "layer9-conv" name: "layer9-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer9-conv" top: "layer9-conv" name: "layer9-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer9-conv" top: "layer9-conv" name: "layer9-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer9-conv" top: "layer10-maxpool" name: "layer10-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer10-maxpool" top: "layer11-conv" name: 
"layer11-conv" type: "Convolution" convolution_param { num_output: 512 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer11-conv" top: "layer11-conv" name: "layer11-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer11-conv" top: "layer11-conv" name: "layer11-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer11-conv" top: "layer11-conv" name: "layer11-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer11-conv" top: "layer12-maxpool" name: "layer12-maxpool" type: "Pooling" pooling_param { stride: 1 pool: MAX kernel_size: 3 pad: 1 } } layer { bottom: "layer12-maxpool" top: "layer13-conv" name: "layer13-conv" type: "Convolution" convolution_param { num_output: 1024 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer13-conv" top: "layer13-conv" name: "layer13-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer13-conv" top: "layer13-conv" name: "layer13-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer13-conv" top: "layer13-conv" name: "layer13-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer13-conv" top: "layer14-conv" name: "layer14-conv" type: "Convolution" convolution_param { num_output: 256 kernel_size: 1 pad: 0 stride: 1 bias_term: false } } layer { bottom: "layer14-conv" top: "layer14-conv" name: "layer14-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer14-conv" top: "layer14-conv" name: "layer14-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer14-conv" top: "layer14-conv" name: "layer14-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer14-conv" top: "layer15-conv" name: "layer15-conv" type: "Convolution" convolution_param { num_output: 512 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer15-conv" top: "layer15-conv" name: "layer15-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer15-conv" top: "layer15-conv" name: "layer15-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer15-conv" top: "layer15-conv" name: "layer15-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer15-conv" top: "layer16-conv" name: "layer16-conv" type: "Convolution" convolution_param { num_output: 21 kernel_size: 1 pad: 0 stride: 1 bias_term: true } } layer { bottom: "layer14-conv" top: "layer18-route" name: "layer18-route" type: "Concat" } layer { bottom: "layer18-route" top: "layer19-conv" name: "layer19-conv" type: "Convolution" convolution_param { num_output: 128 kernel_size: 1 pad: 0 stride: 1 bias_term: false } } layer { bottom: "layer19-conv" top: "layer19-conv" name: "layer19-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer19-conv" top: "layer19-conv" name: "layer19-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer19-conv" top: "layer19-conv" name: "layer19-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer19-conv" top: "layer20-upsample" name: "layer20-upsample" type: "Upsample" #upsample_param { # scale: 2 #} } layer { bottom: "layer20-upsample" bottom: "layer9-conv" top: "layer21-route" name: "layer21-route" type: "Concat" } layer { bottom: "layer21-route" top: "layer22-conv" name: "layer22-conv" type: "Convolution" convolution_param { num_output: 256 kernel_size: 3 
pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer22-conv" top: "layer22-conv" name: "layer22-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer22-conv" top: "layer22-conv" name: "layer22-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer22-conv" top: "layer22-conv" name: "layer22-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer22-conv" top: "layer23-conv" name: "layer23-conv" type: "Convolution" convolution_param { num_output: 21 kernel_size: 1 pad: 0 stride: 1 bias_term: true } } layer { bottom: "layer22-conv" top: "layer25-route" name: "layer25-route" type: "Concat" } layer { bottom: "layer25-route" top: "layer26-conv" name: "layer26-conv" type: "Convolution" convolution_param { num_output: 128 kernel_size: 1 pad: 0 stride: 1 bias_term: false } } layer { bottom: "layer26-conv" top: "layer26-conv" name: "layer26-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer26-conv" top: "layer26-conv" name: "layer26-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer26-conv" top: "layer26-conv" name: "layer26-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer26-conv" top: "layer27-upsample" name: "layer27-upsample" type: "Upsample" #upsample_param { # scale: 2 #} } layer { bottom: "layer27-upsample" bottom: "layer7-conv" top: "layer28-route" name: "layer28-route" type: "Concat" } layer { bottom: "layer28-route" top: "layer29-conv" name: "layer29-conv" type: "Convolution" convolution_param { num_output: 128 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer29-conv" top: "layer29-conv" name: "layer29-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer29-conv" top: "layer29-conv" name: "layer29-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer29-conv" top: "layer29-conv" name: "layer29-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer29-conv" top: "layer30-conv" name: "layer30-conv" type: "Convolution" convolution_param { num_output: 21 kernel_size: 1 pad: 0 stride: 1 bias_term: true } }
layer { bottom: "layer16-conv" bottom: "layer23-conv" bottom: "layer30-conv" top: "yolo-det" name: "yolo-det" type: "Yolo" }
@aditbhrgv You just need to do the following if you want to use another yolo model:
1. Edit the .cfg file.
2. Change the number of classes.
3. Edit the YoloKernel entries so the filter (feature-map) size matches your anchor boxes.
4. Check your model again.
@cong235 thank you! It works now. However, int8 mode is not working for me. I am using TensorRT 5.0.2.6 (is there some dependency on the TensorRT version?). I chose 10 images for the calibration dataset, which is a subset of my validation dataset.
####### input args #######
C=3; H=608; W=608; batchsize=1; caffemodel=yolov3-3l.caffemodel; calib=calib_sample.txt; class=2; enginefile=; evallist=; input=000000.jpg; mode=int8; nms=0.450000; outputs=yolo-det; prototxt=yolov3-3l.prototxt;
####### end args #######
find calibration file, loading ...
init plugin
proto: yolov3-3l.prototxt
caffemodel: yolov3-3l.caffemodel
create calibrator, Named: yolov3-3l
Begin parsing model...
End parsing model...
setInt8Mode
Begin building engine...
End building engine...
save Engine...yolov3_int8.engine
process: 000000.jpg
Time taken for inference is 9.37288 ms.
Det count is 0
Time taken for nms is 0.001386 ms.
layer1-conv input reformatter 0 0.038ms
layer1-conv 0.130ms
layer1-act 0.134ms
layer2-maxpool 0.093ms
layer3-conv input reformatter 0 0.037ms
layer3-conv 0.071ms
layer3-act 0.069ms
layer4-maxpool 0.049ms
layer5-conv input reformatter 0 0.020ms
layer5-conv 0.046ms
layer5-act 0.036ms
layer6-maxpool 0.030ms
layer7-conv input reformatter 0 0.011ms
layer7-conv 0.038ms
layer7-act 0.017ms
layer8-maxpool 0.016ms
layer9-conv input reformatter 0 0.006ms
layer9-conv 0.046ms
layer9-act 0.006ms
layer10-maxpool 0.007ms
layer11-conv input reformatter 0 0.004ms
layer11-conv 0.070ms
layer11-act 0.004ms
layer12-maxpool 0.010ms
layer13-conv input reformatter 0 0.005ms
layer13-conv 0.137ms
layer13-act input reformatter 0 0.011ms
layer13-act 0.005ms
layer14-conv input reformatter 0 0.008ms
layer14-conv 0.038ms
layer14-act 0.004ms
layer14-act output reformatter 0 0.005ms
layer15-conv 0.053ms
layer15-act 0.004ms
layer16-conv 0.016ms
layer14-conv copy 0.006ms
layer19-conv 0.012ms
layer19-act 0.003ms
layer20-upsample 0.010ms
layer20-upsample copy 0.007ms
layer9-conv copy 0.010ms
layer22-conv 0.099ms
layer22-act 0.006ms
layer23-conv 0.015ms
layer22-conv copy 0.009ms
layer26-conv 0.023ms
layer26-act 0.005ms
layer27-upsample 0.029ms
layer27-upsample copy 0.019ms
layer7-conv copy 0.018ms
layer29-conv 0.104ms
layer29-act 0.017ms
layer30-conv 0.031ms
yolo-det 6.559ms
Time over all layers: 8.263
@aditbhrgv Hi, I'm also trying to use TensorRT for real-time object detection on a TX2, and I believe the TX2 does not support int8 mode. It works on Xavier and some GeForce cards.
@aditbhrgv "Not supported" doesn't mean it won't run, bro. It means you can't run it at the full INT8 rate. But in practice some models run very fast in INT8; yolov3-tiny can run at 66 FPS on a TX2 with INT8.
Hi, it seems that https://github.com/ChenYingpeng/caffe-yolov3/tree/master/model_convert doesn't exist anymore. Are there any other methods?
Hi, why did your initial tiny model not work? I just converted the official yolov3-tiny.cfg/weights and made the change in YoloConfigs.h like this:

//YOLO 416
YoloKernel yolo1 = { 13, 13, {81,82, 135,169, 344,319} };
YoloKernel yolo2 = { 26, 26, {10,14, 23,27, 37,58} };

but it doesn't give any detections. Can you share some of your experiences?
@cong235 Could you please explain what you mean by step 3, "filter size match with anchor box"? Where do I get the filter size from my yolov3.cfg file? Also, I have 9 pairs of anchors; how do I change my YoloKernel? My config: num_classes = 9, filters = 42 (original yolov3 = 255), anchors = 11.3950,25.3481, 21.0826,48.1415, 29.8289,76.1553, 35.8586,132.5984, 66.4218,89.7861, 92.6243,139.2145, 164.5912,141.0014, 140.5117,216.2245, 238.7294,323.5685. Could you please show me an example?
@prajwaljpj Hi, what is your kernel feature size? If the input is 416, try this:

YoloKernel yolo1 = { 13, 13, {164.5912,141.0014, 140.5117,216.2245, 238.7294,323.5685} };
YoloKernel yolo2 = { 26, 26, {35.8586,132.5984, 66.4218,89.7861, 92.6243,139.2145} };
YoloKernel yolo3 = { 52, 52, {11.3950,25.3481, 21.0826,48.1415, 29.8289,76.1553} };

and make sure each anchor set belongs to the corresponding kernel (the largest anchors go to the coarsest 13x13 grid).
@lewes6369 Thanks for the quick reply. Yes, the kernel feature size is 416, and it worked!
@cong235 Hello, my yolov3-tiny always reports an error:

####### input args #######
C=3; H=416; W=416; caffemodel=./yolov3-tiny.caffemodel; calib=; cam=0; class=80; classname=coco.name; display=1; evallist=; input=0; inputstream=cam; mode=fp32; nms=0.450000; outputs=yolo-det; prototxt=./yolov3-tiny-trt.prototxt; savefile=result; saveimg=0; videofile=sample.mp4;
####### end args #######
init plugin
proto: ./yolov3-tiny-trt.prototxt
caffemodel: ./yolov3-tiny.caffemodel
Begin parsing model...
ERROR: layer21-route: all concat input tensors must have the same dimensions except on the concatenation axis
runYolov3: ./parserHelper.h:97: nvinfer1::DimsCHW parserhelper::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.
Aborted (core dumped)
Can you give me your yolov3-tiny?
@cong235 My yolov3-tiny-trt.prototxt:

name: "Darkent2Caffe"
input: "data"
input_dim: 1
input_dim: 3
input_dim: 416
input_dim: 416
layer { bottom: "data" top: "layer1-conv" name: "layer1-conv" type: "Convolution" convolution_param { num_output: 16 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer1-conv" top: "layer1-conv" name: "layer1-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer1-conv" top: "layer1-conv" name: "layer1-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer1-conv" top: "layer1-conv" name: "layer1-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer1-conv" top: "layer2-maxpool" name: "layer2-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer2-maxpool" top: "layer3-conv" name: "layer3-conv" type: "Convolution" convolution_param { num_output: 32 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer3-conv" top: "layer3-conv" name: "layer3-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer3-conv" top: "layer3-conv" name: "layer3-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer3-conv" top: "layer3-conv" name: "layer3-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer3-conv" top: "layer4-maxpool" name: "layer4-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer4-maxpool" top: "layer5-conv" name: "layer5-conv" type: "Convolution" convolution_param { num_output: 64 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer5-conv" top: "layer5-conv" name: "layer5-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer5-conv" top: "layer5-conv" name: "layer5-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer5-conv" top: "layer5-conv" name: "layer5-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer5-conv" top: "layer6-maxpool" name: "layer6-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer6-maxpool" top: "layer7-conv" name: "layer7-conv" type: "Convolution" convolution_param { num_output: 128 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer7-conv" top: "layer7-conv" name: "layer7-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer7-conv" top: "layer7-conv" name: "layer7-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer7-conv" top: "layer7-conv" name: "layer7-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer7-conv" top: "layer8-maxpool" name: "layer8-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer8-maxpool" top: "layer9-conv" name: "layer9-conv" type: "Convolution" convolution_param { num_output: 256 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer9-conv" top: "layer9-conv" name: "layer9-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer9-conv" top: "layer9-conv" name: "layer9-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer9-conv" top: "layer9-conv" name: "layer9-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer9-conv" top: "layer10-maxpool" name: "layer10-maxpool" type: "Pooling" pooling_param { stride: 2 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer10-maxpool" top: "layer11-conv" name: 
"layer11-conv" type: "Convolution" convolution_param { num_output: 512 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer11-conv" top: "layer11-conv" name: "layer11-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer11-conv" top: "layer11-conv" name: "layer11-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer11-conv" top: "layer11-conv" name: "layer11-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer11-conv" top: "layer12-maxpool" name: "layer12-maxpool" type: "Pooling" pooling_param { stride: 1 pool: MAX kernel_size: 2 pad: 0 } } layer { bottom: "layer12-maxpool" top: "layer13-conv" name: "layer13-conv" type: "Convolution" convolution_param { num_output: 1024 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer13-conv" top: "layer13-conv" name: "layer13-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer13-conv" top: "layer13-conv" name: "layer13-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer13-conv" top: "layer13-conv" name: "layer13-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer13-conv" top: "layer14-conv" name: "layer14-conv" type: "Convolution" convolution_param { num_output: 256 kernel_size: 1 pad: 0 stride: 1 bias_term: false } } layer { bottom: "layer14-conv" top: "layer14-conv" name: "layer14-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer14-conv" top: "layer14-conv" name: "layer14-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer14-conv" top: "layer14-conv" name: "layer14-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer14-conv" top: "layer15-conv" name: "layer15-conv" type: "Convolution" convolution_param { num_output: 512 kernel_size: 3 pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer15-conv" top: "layer15-conv" name: "layer15-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer15-conv" top: "layer15-conv" name: "layer15-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer15-conv" top: "layer15-conv" name: "layer15-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer15-conv" top: "layer16-conv" name: "layer16-conv" type: "Convolution" convolution_param { num_output: 255 kernel_size: 1 pad: 0 stride: 1 bias_term: true } } layer { bottom: "layer14-conv" top: "layer18-route" name: "layer18-route" type: "Concat" } layer { bottom: "layer18-route" top: "layer19-conv" name: "layer19-conv" type: "Convolution" convolution_param { num_output: 128 kernel_size: 1 pad: 0 stride: 1 bias_term: false } } layer { bottom: "layer19-conv" top: "layer19-conv" name: "layer19-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer19-conv" top: "layer19-conv" name: "layer19-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer19-conv" top: "layer19-conv" name: "layer19-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer19-conv" top: "layer20-upsample" name: "layer20-upsample" type: "Upsample" #upsample_param { # scale: 2 #} } layer { bottom: "layer20-upsample" bottom: "layer9-conv" top: "layer21-route" name: "layer21-route" type: "Concat" } layer { bottom: "layer21-route" top: "layer22-conv" name: "layer22-conv" type: "Convolution" convolution_param { num_output: 256 kernel_size: 3 
pad: 1 stride: 1 bias_term: false } } layer { bottom: "layer22-conv" top: "layer22-conv" name: "layer22-bn" type: "BatchNorm" batch_norm_param { use_global_stats: true } } layer { bottom: "layer22-conv" top: "layer22-conv" name: "layer22-scale" type: "Scale" scale_param { bias_term: true } } layer { bottom: "layer22-conv" top: "layer22-conv" name: "layer22-act" type: "ReLU" relu_param { negative_slope: 0.1 } } layer { bottom: "layer22-conv" top: "layer23-conv" name: "layer23-conv" type: "Convolution" convolution_param { num_output: 255 kernel_size: 1 pad: 0 stride: 1 bias_term: true } } layer { bottom: "layer16-conv" bottom: "layer23-conv" top: "yolo-det" name: "yolo-det" type: "Yolo" }
I am having the same issue. I get the error:

Begin parsing model...
ERROR: layer21-route: all concat input tensors must have the same dimensions except on the concatenation axis (0), but dimensions mismatched at input 1 at index 1. Input 0 shape: [128,24,24], Input 1 shape: [256,26,26]
runYolov3: ./parserHelper.h:97: nvinfer1::DimsCHW parserhelper::getCHW(const nvinfer1::Dims&): Assertion `d.nbDims >= 3' failed.
Aborted (core dumped)

Can anyone help me with this problem?
@sandeepjangir07 You should first try a simple model downloaded from the yolov3 homepage. And please confirm that your yolov3-tiny-trt.prototxt has the new "Yolo" layer added.
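Regarding the layer21-route errors above, one plausible explanation (an observation, not a confirmed fix from this thread): Caffe computes a pooled output size as out = ceil((in + 2*pad - kernel)/stride) + 1, so a stride-1 maxpool written with kernel_size: 2, pad: 0 shrinks the 13x13 map to 12x12; after the 2x upsample that gives 24x24, which cannot be concatenated with the 26x26 skip tensor (matching the [128,24,24] vs [256,26,26] message). The working prototxt earlier in the thread writes layer12-maxpool with kernel_size: 3, pad: 1, which keeps 13x13. A small check:

#include <cmath>
#include <cstdio>

// Caffe pooling output size: out = ceil((in + 2*pad - kernel) / stride) + 1
int pooledSize(int in, int kernel, int pad, int stride)
{
    return (int)std::ceil((in + 2 * pad - kernel) / (double)stride) + 1;
}

int main()
{
    // kernel 2, pad 0: 13 -> 12, upsample -> 24, concat with 26x26 fails
    printf("kernel=2 pad=0: 13 -> %d\n", pooledSize(13, 2, 0, 1));
    // kernel 3, pad 1: 13 -> 13, upsample -> 26, concat matches
    printf("kernel=3 pad=1: 13 -> %d\n", pooledSize(13, 3, 1, 1));
    return 0;
}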
@cong235 Hi, you mentioned that we can run int8 mode on the TX2 and that it's faster than fp16. However, I set the mode to int8 and calibrated with 30 pictures, and I get almost the same speed on Xavier for yolo-tiny (about 85 FPS). I want to know why, and is there a way to force the weights to int8 without calibration? I can see that the int8 engine is just a little smaller than the fp16 engine, which suggests there's nearly no difference between my engines.
@zeyuDai2018 Yes, but that is just your engine, not every engine, bro. For example, consider two nodes with weights 2.0 and 8.0:
FP16: 2.0 * 8.0 = 16.0
INT8: 2 * 8 = 16
The result is the same.
But if your weights are 2.1 and 8.1:
FP16: 2.1 * 8.1 = 17.01
INT8: 2 * 8 = 16
The results differ.
I mean, INT8 is lighter than FP16, but that doesn't mean INT8 is faster than FP16. Whether it is faster depends on your model and your weights.
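As a footnote on the hardware dependence: the TensorRT builder of this era can report whether the device has fast native INT8 before you request it, which is one way to reconcile the TX2/Xavier observations above. A hedged sketch; the fallback policy is illustrative, not the repo's actual logic:

#include <NvInfer.h>

// Probe the device before choosing a precision. On platforms without fast
// INT8 support, an "INT8" engine buys little or no speed over FP16/FP32.
void choosePrecision(nvinfer1::IBuilder* builder, nvinfer1::IInt8Calibrator* calibrator)
{
    if (builder->platformHasFastInt8()) {
        builder->setInt8Mode(true);              // request INT8 kernels
        builder->setInt8Calibrator(calibrator);  // e.g. 10-30 calibration images
    } else if (builder->platformHasFastFp16()) {
        builder->setFp16Mode(true);              // FP16 fallback
    }
    // otherwise the engine is built in FP32
}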