FastDeploy
[Backend & Serving] Serving and Runtime support Clone
PR types
Backend Serving
Description
- OpenVINO and Option now support num_streams
- Runtime (PaddleBackend and OpenVINOBackend) supports Clone; TRT and ORT will follow in the next PR
- Serving creates multiple instances via Runtime*->Clone
Multiple Runtime without Clone example:
fastdeploy::Runtime runtime;
runtime.Init(runtime_option);
fastdeploy::Runtime runtime2;
runtime2.Init(runtime_option2);
Multiple Runtime with Clone example:
fastdeploy::Runtime runtime;
runtime.Init(runtime_option);
auto* runtime2 = runtime.Clone();
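Why Clone saves memory can be illustrated with a small self-contained sketch. It does not use FastDeploy itself: `MockRuntime`, `ModelWeights`, and `weight_owners` are hypothetical names, and the assumption is that a cloned runtime shares the already loaded weights with its source instead of loading a second copy. Unlike the `Clone()` call above, which returns a raw pointer, the mock returns a `std::unique_ptr` so that ownership is explicit.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for loaded model weights (not a FastDeploy type).
struct ModelWeights {
  std::vector<float> data;
};

// Hypothetical mock of a runtime; this is NOT fastdeploy::Runtime.
class MockRuntime {
 public:
  // "Loads" the model by allocating the weights once.
  void Init(std::size_t num_params) {
    weights_ = std::make_shared<const ModelWeights>(
        ModelWeights{std::vector<float>(num_params, 0.0f)});
  }

  // Returns a new instance that shares the already loaded weights.
  std::unique_ptr<MockRuntime> Clone() const {
    auto copy = std::make_unique<MockRuntime>();
    copy->weights_ = weights_;  // shared, not duplicated
    return copy;
  }

  // Address of the weights this instance reads from.
  const ModelWeights* weights() const { return weights_.get(); }

  // Number of runtime instances currently sharing these weights.
  long weight_owners() const { return weights_.use_count(); }

 private:
  std::shared_ptr<const ModelWeights> weights_;
};
```

In this sketch, two instances created with `Init` each hold their own weights, while an instance created with `Clone` points at the same weights as its source, which is the behavior the memory figures in this PR suggest.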
Clone mode reduces memory consumption. Model: ResNet50_vd. Configuration: CPU enabled, 4 threads.
Number of instances | Clone mode | Without Clone |
---|---|---|
1 | 301M | 301M |
2 | 301M | 424M |
3 | 301M | 646M |
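The savings in the table above can be worked out with a few lines of arithmetic. The sketch below copies the three rows verbatim and assumes that "M" means megabytes; `MemorySample`, `SavedMb`, and `SavedPercent` are hypothetical helper names introduced only for this illustration.

```cpp
#include <array>
#include <cassert>

// One row of the memory table (values assumed to be in MB).
struct MemorySample {
  int instances;
  int clone_mb;
  int no_clone_mb;
};

// Rows copied from the ResNet50_vd table (CPU, 4 threads).
constexpr std::array<MemorySample, 3> kResNet50Samples = {{
    {1, 301, 301},
    {2, 301, 424},
    {3, 301, 646},
}};

// Absolute memory saved by Clone mode for one row.
constexpr int SavedMb(const MemorySample& s) {
  return s.no_clone_mb - s.clone_mb;
}

// Memory saved relative to the non-Clone footprint, in percent.
constexpr double SavedPercent(const MemorySample& s) {
  return 100.0 * SavedMb(s) / s.no_clone_mb;
}
```

With these rows, Clone mode saves 0 MB with one instance, 123 MB (about 29%) with two, and 345 MB (about 53%) with three.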
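With OpenVINO, using Clone and setting num_streams (equal to the number of instances) significantly improves performance:
Concurrency | OpenVINO, optimized (2 instances, 2 streams, 16 threads) | OpenVINO, before optimization (2 instances, 16 threads) | Latency reduction |
---|---|---|---|
1 | 79.622 ms | 77.828 ms | -2.3% |
3 | 78.774 ms | 99.420 ms | 20.76% |
5 | 105.108 ms | 160.446 ms | 30.49% |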
TRT backend. Model: Yolov5. Configuration: GPU with TRT enabled, 2 instances.
Resource usage | Clone mode (same GPU) | Without Clone |
---|---|---|
Host memory | 1.6G | 1.7G |
GPU memory | 663M | 981M |