vs-realesrgan performance so much worse than using trt.Model

Open efschu opened this issue 2 months ago • 0 comments

Hi,

I just compared

clip = core.resize.Bicubic(clip, width=720, height=574, format=vs.RGBS, matrix_in_s='709')
clip = core.trt.Model(clip, engine_path="realesr-general-wdn-x4v3_opset16_574x720_fp16.engine", num_streams=4, device_id=0)

where the engine was created with

trtexec --fp16 --onnx=./realesr-general-wdn-x4v3_opset16.onnx --minShapes=input:1x3x8x8 --optShapes=input:1x3x574x720 --maxShapes=input:1x3x574x720 --saveEngine=./realesr-general-wdn-x4v3_opset16_574x720_fp16.engine --tacticSources=+CUDNN,-CUBLAS,-CUBLAS_LT --skipInference --useCudaGraph --noDataTransfers --builderOptimizationLevel=5 --infStreams=4

gives me:

vspipe -e 300 test3.vpy -p .
Script evaluation done in 2.26 seconds
Output 301 frames in 8.15 seconds (36.92 fps)

using vs-realesrgan instead:

clip = core.resize.Bicubic(clip, width=720, height=574, format=vs.RGBH, matrix_in_s='709')
clip = realesrgan(clip, num_streams = 4, trt = True, model = 5, denoise_strength = 0)

gives me:

vspipe -e 300 test3.vpy -p .
Script evaluation done in 2.26 seconds
Output 601 frames in 23.73 seconds (25.33 fps)

Why is core.trt.Model almost 50% faster (36.92 fps vs. 25.33 fps)?
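For reference, here is a minimal sketch of how the relative difference can be worked out from the two vspipe summary lines above. The helper name `parse_fps` is my own and not part of vspipe or vs-realesrgan. Since the two runs report different frame counts (301 vs. 601), it compares frames per second rather than wall time.

```python
import re


def parse_fps(line: str) -> float:
    """Compute frames / seconds from a vspipe 'Output N frames in S seconds' line.

    parse_fps is a hypothetical helper written for this comparison only.
    """
    m = re.search(r"Output (\d+) frames in ([\d.]+) seconds", line)
    if m is None:
        raise ValueError(f"not a vspipe summary line: {line!r}")
    frames, seconds = int(m.group(1)), float(m.group(2))
    return frames / seconds


# Summary lines copied from the two runs above.
trt_model = parse_fps("Output 301 frames in 8.15 seconds (36.92 fps)")
vs_realesrgan = parse_fps("Output 601 frames in 23.73 seconds (25.33 fps)")

# How much faster core.trt.Model is, relative to vs-realesrgan, in percent.
speedup_percent = (trt_model / vs_realesrgan - 1) * 100
print(f"core.trt.Model is about {speedup_percent:.0f}% faster")
```

By this measure the gap is a bit under 50%, which is what I meant by "almost 50%".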

Apr 19 '24 10:04 efschu
