FeiYull
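@Bruce-WangGF Multi-batch inference is only supported for video and camera input.
@qqqe2 If your image is a single-channel image such as a grayscale image, this is not needed.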
@ningjianfeng Install OpenCV: `sudo apt-get install libopencv-dev`
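@Ichiruchan When the out-of-bounds access happens, print all of those variables and also save the image so you can debug it separately. Most likely an earlier problem is causing an abnormal value in m_output_objects_host.
@KARAS1998 Add the sampleUtils.cpp file from common/sample under the TensorRT installation directory to your project, and include the corresponding header file in the project properties as well.
@Yi-hash1 The ONNX you exported has a static input; see the ONNX export command in the repository.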
@JinRanYAO Is the data you're testing a picture or a video?
@JinRanYAO Try the following command to build the engine with FP16 quantization; it should improve performance by about 100%: `./trtexec --onnx=yolov8n-pose.onnx --saveEngine=yolov8n-pose-fp16.trt --buildOnly --minShapes=images:1x3x640x640 --optShapes=images:2x3x640x640 --maxShapes=images:4x3x640x640 --fp16` 【FP32】: [04/07/2024-09:15:16] [I] preprocess time...
@JinRanYAO I recommend stepping into YOLOv8Pose::preprocess and timing each step inside it to see where the overhead comes from. [void YOLOv8Pose::preprocess(const std::vector& imgsBatch)](https://github.com/FeiYull/TensorRT-Alpha/blob/bca9575229ef5f6fe4c5acf51c1bd3c7e5959ec6/yolov8-pose/yolov8_pose.cpp#L103)
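One way to measure the per-step overhead is a small wall-clock timer wrapped around each call inside preprocess. This is only a sketch: `meanMillis` is a hypothetical helper, not something from the repository, and it assumes each GPU step is followed by `cudaDeviceSynchronize()` inside the callable, since kernel launches return before the kernel finishes.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not part of TensorRT-Alpha): runs `stage` `iters`
// times and returns the mean wall-clock time per run in milliseconds.
template <typename F>
double meanMillis(F&& stage, int iters = 100) {
    assert(iters > 0);
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    for (int i = 0; i < iters; ++i) {
        // For a GPU step, the callable should end with cudaDeviceSynchronize(),
        // otherwise only the launch cost is measured.
        stage();
    }
    const auto t1 = clock::now();
    return std::chrono::duration<double, std::milli>(t1 - t0).count() / iters;
}
```

Wrapping the resize, color-conversion, and normalization calls in separate lambdas and comparing the three numbers should show which step dominates.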
@JinRanYAO You can merge the following three operations into one: 1. resizeDevice 2. bgr2rgbDevice 3. normDevice. Inside the CUDA kernel function that resizeDevice calls, modify the following: [before modification] https://github.com/FeiYull/TensorRT-Alpha/blob/bca9575229ef5f6fe4c5acf51c1bd3c7e5959ec6/utils/kernel_function.cu#L142 [modify...
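To illustrate what merging the three steps means, here is a CPU reference of a fused per-pixel pass in plain C++. It is a sketch under assumptions, not the repository's kernel: `fusedPreprocess` is a hypothetical name, nearest-neighbor sampling stands in for whatever interpolation resizeDevice uses, and the 1/255 scale is an assumed normalization. In a CUDA kernel, the body of the inner loop would be the work of one thread.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU reference: resize (nearest neighbor) + BGR->RGB + scaling
// done in a single pass, so each output pixel is written exactly once.
// Input:  interleaved 8-bit BGR, HWC layout.
// Output: interleaved float RGB, HWC layout.
std::vector<float> fusedPreprocess(const std::vector<std::uint8_t>& src,
                                   int srcW, int srcH, int dstW, int dstH,
                                   float scale = 1.0f / 255.0f) {
    assert(srcW > 0 && srcH > 0 && dstW > 0 && dstH > 0);
    assert(src.size() == static_cast<std::size_t>(srcW) * srcH * 3);
    std::vector<float> dst(static_cast<std::size_t>(dstW) * dstH * 3);
    for (int y = 0; y < dstH; ++y) {
        const int sy = y * srcH / dstH;  // nearest source row
        for (int x = 0; x < dstW; ++x) {
            const int sx = x * srcW / dstW;  // nearest source column
            const std::uint8_t* p =
                &src[(static_cast<std::size_t>(sy) * srcW + sx) * 3];
            float* q = &dst[(static_cast<std::size_t>(y) * dstW + x) * 3];
            q[0] = p[2] * scale;  // R comes from BGR index 2
            q[1] = p[1] * scale;  // G stays in the middle
            q[2] = p[0] * scale;  // B comes from BGR index 0
        }
    }
    return dst;
}
```

The gain from fusing on the GPU comes from launching one kernel instead of three and from skipping the intermediate buffers that would otherwise be written and read back between steps.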