Tengine
OpenCL backend does not work on the NXP i.MX8 QXP platform (with Vivante GC7000Lite GPU)
Command line:
/usr/local/tengine_ocl/bin/tm_retinaface -m /usr/local/tengine/models/retinaface.tmfile -i /usr/local/tengine/images/mtcnn_face4.jpg -r 5 -t 1
Log output:
tengine-lite library version: 1.5-dev
data out rf_c3_lateral_relu -nan
data out mobilenet0_relu22_fwd 3.39615e+38
data out mobilenet0_relu10_fwd 0.14199
data out rf_c2_aggr_relu -nan
data out face_rpn_cls_score_reshape_stride8 -nan
data out face_rpn_cls_score_reshape_stride16 -nan
data out face_rpn_cls_score_reshape_stride32 -nan
data out rf_c1_det_concat_relu -nan
data out rf_c2_det_concat_relu -nan
data out rf_c3_det_concat_relu -nan
data out face_rpn_landmark_pred_stride8 -nan
data out face_rpn_bbox_pred_stride8 -nan
data out face_rpn_cls_prob_reshape_stride8 -nan
data out face_rpn_landmark_pred_stride16 -nan
data out face_rpn_bbox_pred_stride16 -nan
data out face_rpn_cls_prob_reshape_stride16 -nan
data out face_rpn_landmark_pred_stride32 -nan
data out face_rpn_bbox_pred_stride32 -nan
data out face_rpn_cls_prob_reshape_stride32 -nan
data out rf_c3_lateral_relu -nan
data out mobilenet0_relu22_fwd 3.39615e+38
data out mobilenet0_relu10_fwd 0.156688
data out rf_c2_aggr_relu -nan
data out face_rpn_cls_score_reshape_stride8 -nan
data out face_rpn_cls_score_reshape_stride16 -nan
data out face_rpn_cls_score_reshape_stride32 -nan
data out rf_c1_det_concat_relu -nan
data out rf_c2_det_concat_relu -nan
data out rf_c3_det_concat_relu -nan
data out face_rpn_landmark_pred_stride8 -nan
data out face_rpn_bbox_pred_stride8 -nan
data out face_rpn_cls_prob_reshape_stride8 -nan
data out face_rpn_landmark_pred_stride16 -nan
data out face_rpn_bbox_pred_stride16 -nan
data out face_rpn_cls_prob_reshape_stride16 -nan
data out face_rpn_landmark_pred_stride32 -nan
data out face_rpn_bbox_pred_stride32 -nan
data out face_rpn_cls_prob_reshape_stride32 -nan
data out rf_c3_lateral_relu -nan
data out mobilenet0_relu22_fwd 3.39615e+38
data out mobilenet0_relu10_fwd 0.171386
data out rf_c2_aggr_relu -nan
data out face_rpn_cls_score_reshape_stride8 -nan
data out face_rpn_cls_score_reshape_stride16 -nan
data out face_rpn_cls_score_reshape_stride32 -nan
data out rf_c1_det_concat_relu -nan
data out rf_c2_det_concat_relu -nan
data out rf_c3_det_concat_relu -nan
data out face_rpn_landmark_pred_stride8 -nan
data out face_rpn_bbox_pred_stride8 -nan
data out face_rpn_cls_prob_reshape_stride8 -nan
data out face_rpn_landmark_pred_stride16 -nan
data out face_rpn_bbox_pred_stride16 -nan
data out face_rpn_cls_prob_reshape_stride16 -nan
data out face_rpn_landmark_pred_stride32 -nan
data out face_rpn_bbox_pred_stride32 -nan
data out face_rpn_cls_prob_reshape_stride32 -nan
data out rf_c3_lateral_relu -nan
data out mobilenet0_relu22_fwd 3.39615e+38
data out mobilenet0_relu10_fwd 0.186084
data out rf_c2_aggr_relu -nan
data out face_rpn_cls_score_reshape_stride8 -nan
data out face_rpn_cls_score_reshape_stride16 -nan
data out face_rpn_cls_score_reshape_stride32 -nan
data out rf_c1_det_concat_relu -nan
data out rf_c2_det_concat_relu -nan
data out rf_c3_det_concat_relu -nan
data out face_rpn_landmark_pred_stride8 -nan
data out face_rpn_bbox_pred_stride8 -nan
data out face_rpn_cls_prob_reshape_stride8 -nan
data out face_rpn_landmark_pred_stride16 -nan
data out face_rpn_bbox_pred_stride16 -nan
data out face_rpn_cls_prob_reshape_stride16 -nan
data out face_rpn_landmark_pred_stride32 -nan
data out face_rpn_bbox_pred_stride32 -nan
data out face_rpn_cls_prob_reshape_stride32 -nan
data out rf_c3_lateral_relu -nan
data out mobilenet0_relu22_fwd 3.39615e+38
data out mobilenet0_relu10_fwd 0.200783
data out rf_c2_aggr_relu -nan
data out face_rpn_cls_score_reshape_stride8 -nan
data out face_rpn_cls_score_reshape_stride16 -nan
data out face_rpn_cls_score_reshape_stride32 -nan
data out rf_c1_det_concat_relu -nan
data out rf_c2_det_concat_relu -nan
data out rf_c3_det_concat_relu -nan
data out face_rpn_landmark_pred_stride8 -nan
data out face_rpn_bbox_pred_stride8 -nan
data out face_rpn_cls_prob_reshape_stride8 -nan
data out face_rpn_landmark_pred_stride16 -nan
data out face_rpn_bbox_pred_stride16 -nan
data out face_rpn_cls_prob_reshape_stride16 -nan
data out face_rpn_landmark_pred_stride32 -nan
data out face_rpn_bbox_pred_stride32 -nan
data out face_rpn_cls_prob_reshape_stride32 -nan
img_h, img_w : 316, 474
Repeat 5 times, thread 1, avg time 663.77 ms, max_time 669.14 ms, min_time 662.25 ms
--------------------------------------
detected face num: 0
Segmentation fault (core dumped)
This issue is resolved. The main cause was the work_item_max limit.
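For readers hitting the same symptom, here is a minimal sketch of one way to keep a kernel launch within a device's work-item limits. It is not the actual Tengine fix, which is not shown in this thread. The helper names `pick_local_size` and `clamp_local_work_size` are hypothetical, and the sketch assumes the limits were already read with `clGetDeviceInfo` using `CL_DEVICE_MAX_WORK_ITEM_SIZES` and `CL_DEVICE_MAX_WORK_GROUP_SIZE`. It is plain C so it can be compiled without OpenCL headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: largest local size that is <= limit and evenly
 * divides global_size, so the global size stays a multiple of the local size. */
static size_t pick_local_size(size_t global_size, size_t limit)
{
    assert(global_size > 0 && limit > 0);
    size_t local = global_size < limit ? global_size : limit;
    while (global_size % local != 0)
        --local;                /* always terminates: 1 divides everything */
    return local;
}

/* Hypothetical helper: choose a 3D local work size such that
 *   - local[d] <= max_item_sizes[d]   (per-dimension device limit)
 *   - local[0] * local[1] * local[2] <= max_group_size
 * The two limits are assumed to come from clGetDeviceInfo. */
static void clamp_local_work_size(const size_t global[3],
                                  const size_t max_item_sizes[3],
                                  size_t max_group_size,
                                  size_t local[3])
{
    size_t budget = max_group_size;   /* work-items still available */
    for (int d = 0; d < 3; ++d) {
        size_t limit = max_item_sizes[d] < budget ? max_item_sizes[d] : budget;
        local[d] = pick_local_size(global[d], limit);
        budget /= local[d];           /* stays >= 1 because local[d] <= budget */
    }
}
```

The resulting `local` array could then be passed as the `local_work_size` argument of `clEnqueueNDRangeKernel`. Passing `NULL` there instead lets the OpenCL implementation choose the work-group size, which may be a simpler option on a small GPU.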