vs-mlrt

RIFE v2 model 4.7+ not working with static_shape=False

Open aloola18 opened this issue 2 years ago • 15 comments

With static_shape=False, all RIFE v1 models work, and RIFE v2 models up to 4.6 work.

RIFE v2 models 4.7 and later do not work.

I tried setting workspace=1024, but it still did not work.

trtexec_231118_143522.log

aloola18 avatar Nov 18 '23 07:11 aloola18

Thanks, I can reproduce the problem.

[11/18/2023-16:04:51] [V] [TRT] --------------- Timing Runner: /encode/encode.6/ConvTranspose (CaskDeconvolution[0x8000000a])
[11/18/2023-16:04:51] [V] [TRT] CaskDeconvolution has no valid tactics for this config, skipping

It would be better if you could set verbose=True in backend.TRT() for a more detailed log.

I will now check whether the problem is specific to a particular version of TensorRT.
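For anyone reading their own verbose log, here is a minimal sketch that collects the layers for which a tactic source was skipped, based only on the two line formats in the excerpt above. The function name and regular expressions are illustrative assumptions, not part of vs-mlrt or TensorRT. A single skipped tactic source is not necessarily the failure itself, since TensorRT may still find another implementation for the same layer.

```python
import re
from typing import Iterable, List, Tuple

# Patterns derived from the two log lines quoted in this thread (assumed format).
RUNNER_RE = re.compile(r"Timing Runner: (?P<layer>.+?) \((?P<tactic>\w+)")
SKIP_RE = re.compile(r"(?P<tactic>\S+) has no valid tactics for this config")


def find_skipped_tactics(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (layer name, tactic source) pairs that were reported as skipped."""
    skipped: List[Tuple[str, str]] = []
    current_layer = None
    current_tactic = None
    for line in lines:
        runner = RUNNER_RE.search(line)
        if runner:
            # Remember which layer and tactic source is currently being timed.
            current_layer = runner["layer"]
            current_tactic = runner["tactic"]
            continue
        skip = SKIP_RE.search(line)
        # Only pair a skip message with the runner line that precedes it.
        if skip and current_layer is not None and skip["tactic"] == current_tactic:
            skipped.append((current_layer, skip["tactic"]))
    return skipped
```

It can be run over a log file with something like `find_skipped_tactics(open("trtexec.log", encoding="utf-8", errors="replace"))`, where the file name is a placeholder.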

WolframRhodium avatar Nov 18 '23 08:11 WolframRhodium

Here is my full log: trtexec_231118_155338.log

aloola18 avatar Nov 18 '23 08:11 aloola18

I have reported this issue to NVIDIA. Let's see how they reply.

WolframRhodium avatar Nov 18 '23 09:11 WolframRhodium

They said they're working on it.

TensorRT 9.2.0 released today still suffers from this problem.

WolframRhodium avatar Nov 28 '23 04:11 WolframRhodium

What difference would static_shape=False make? I've looked into the differences between static and dynamic shapes and I kind of get the idea. But I want to know practically speaking will it make any difference when I'm running these with SVP?

netExtra avatar Dec 04 '23 16:12 netExtra

> What difference would static_shape=False make? I've looked into the differences between static and dynamic shapes and I kind of get the idea. But I want to know practically speaking will it make any difference when I'm running these with SVP?

It simply means that you don't have to build an engine every time you change resolution.
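As a rough illustration of that point, the toy model below counts how many engine builds a sequence of input resolutions would trigger. It is only a sketch: the class and function names are hypothetical, and it does not reflect how vs-mlrt or TensorRT manage engines internally. A static engine accepts exactly one resolution, while a dynamic engine accepts anything inside an assumed minimum/maximum range.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple, Union

Shape = Tuple[int, int]  # (width, height)


@dataclass(frozen=True)
class StaticEngine:
    """Toy engine that only accepts the exact shape it was built for."""
    shape: Shape

    def accepts(self, shape: Shape) -> bool:
        return shape == self.shape


@dataclass(frozen=True)
class DynamicEngine:
    """Toy engine that accepts any shape inside [min_shape, max_shape]."""
    min_shape: Shape
    max_shape: Shape

    def accepts(self, shape: Shape) -> bool:
        return all(lo <= v <= hi
                   for lo, v, hi in zip(self.min_shape, shape, self.max_shape))


def count_builds(resolutions: List[Shape],
                 dynamic_range: Optional[Tuple[Shape, Shape]] = None) -> int:
    """Count engine builds needed to serve every resolution in order."""
    engines: List[Union[StaticEngine, DynamicEngine]] = []
    for shape in resolutions:
        if any(engine.accepts(shape) for engine in engines):
            continue  # an existing engine can be reused, no rebuild
        if dynamic_range is None:
            engines.append(StaticEngine(shape))
        else:
            engine = DynamicEngine(*dynamic_range)
            if not engine.accepts(shape):
                raise ValueError(f"{shape} is outside the dynamic range")
            engines.append(engine)
    return len(engines)
```

In this model, playing clips at 1920x1080, 1280x720 and 3840x2160 costs three builds with static shapes and one build with a dynamic range that covers all three.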

KLC04 avatar Dec 04 '23 17:12 KLC04

As a side note, RIFE v4.13 was re-released with a new architecture from hzwer. Might it be useful to re-export the ONNX model to see whether this issue is fixed?

KLC04 avatar Dec 04 '23 18:12 KLC04

I have already implemented a fix in a similar way to the re-release. It does not change the model architecture but simply renames weights in the model. This is an issue in TensorRT rather than in RIFE itself.

WolframRhodium avatar Dec 04 '23 22:12 WolframRhodium

TensorRT 9.3.0 released today still suffers from this problem.

WolframRhodium avatar Feb 01 '24 00:02 WolframRhodium

TensorRT 10.0.0 released today still suffers from this problem.

WolframRhodium avatar Mar 27 '24 04:03 WolframRhodium

onnx files remain unchanged.

For trt, you only need to update the files vstrt.dll and vsmlrt.py, and the whole folder vsmlrt-cuda.

Optionally, you can go to the folders rife(_v2) and delete all .engine, .cache and .lock files, because engines built with an older version of TensorRT cannot (by default) be used by a newer version.
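The cleanup step can be scripted. The following is a minimal sketch, assuming the generated files can be identified purely by their final suffix (.engine, .cache, .lock). The function names are illustrative and the directory you pass in is up to you. It defaults to a dry run so that nothing is deleted until you have checked the list.

```python
from pathlib import Path
from typing import List

# Suffixes named in this thread; treated here as the final file extension.
STALE_SUFFIXES = {".engine", ".cache", ".lock"}


def find_stale_files(model_dir: Path) -> List[Path]:
    """List generated files under model_dir, leaving .onnx files alone."""
    return sorted(
        path for path in model_dir.rglob("*")
        if path.is_file() and path.suffix in STALE_SUFFIXES
    )


def delete_stale_files(model_dir: Path, dry_run: bool = True) -> List[Path]:
    """Return the stale files; delete them only when dry_run is False."""
    stale = find_stale_files(model_dir)
    if not dry_run:
        for path in stale:
            path.unlink()
    return stale
```

For example, `delete_stale_files(Path("models/rife_v2"))` only reports what would be removed, and passing `dry_run=False` performs the deletion. The path shown is a placeholder for wherever your models live.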

WolframRhodium avatar Mar 27 '24 07:03 WolframRhodium

> Optionally, you can go to folder models/rife(_v2) and delete all .engine, .cache and .lock files, because engines for older version of trt cannot (by default) be used by newer version of trt.

Thanks, this is the answer I was looking for. I remember deleting the engines for previous versions, but I just wanted to be sure.

netExtra avatar Mar 27 '24 07:03 netExtra

> onnx files remain unchanged.
>
> For trt, you only need to update the files vstrt.dll and vsmlrt.py, and the whole folder vsmlrt-cuda.
>
> Optionally, you can go to folders rife(_v2) and delete all .engine, .cache and .lock files, because engines for older version of trt cannot (by default) be used by newer version of trt.

Apologies, but do we know why TensorRT 10.0 affects RIFE so negatively?

netExtra avatar Mar 27 '24 19:03 netExtra

I don't know.

WolframRhodium avatar Mar 27 '24 22:03 WolframRhodium

The original problem should be fixed in TensorRT 10.0.1.

On the other hand, I have not received a response to the bug report about the performance regression. I suspect the regression is due to a premature compiler optimization that offloads parts of the computational graph (related to /GridSample_3) to a worker stream and breaks operator fusion.

WolframRhodium avatar Apr 01 '24 03:04 WolframRhodium