vs-mlrt RIFE v2 model 4.7+ not working with static_shape=False

With static_shape=False, all RIFE v1 models work, and RIFE v2 models up to 4.6 work.

RIFE v2 4.7+ does not work.

I tried setting workspace=1024, but it still did not work.

Nov 18 '23 07:11 aloola18

Thanks, I can reproduce the problem.

log

[11/18/2023-16:04:51] [V] [TRT] --------------- Timing Runner: /encode/encode.6/ConvTranspose (CaskDeconvolution[0x8000000a])
[11/18/2023-16:04:51] [V] [TRT] CaskDeconvolution has no valid tactics for this config, skipping

It would be better if you could set verbose=True in backend.TRT() for a more detailed log.
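For anyone going through a long verbose log by hand, a small script can list the layers for which a timing runner reported no valid tactics. This is only a sketch: the two regular expressions are modelled on the two log lines quoted above, and other TensorRT versions may format these messages differently. The function name `layers_without_tactics` is made up for illustration.

```python
import re
import sys

# Patterns modelled on the two verbose log lines quoted in this thread.
# Assumption: other TensorRT versions may word these messages differently.
RUNNER_RE = re.compile(r"Timing Runner: (?P<layer>.+?) \((?P<runner>[^()\[\]]+)\[")
NO_TACTIC_RE = re.compile(r"\[TRT\] (?P<runner>\S+) has no valid tactics")


def layers_without_tactics(lines):
    """Return (layer, runner) pairs where the runner reported no valid tactics."""
    hits = []
    last = None  # most recent (layer, runner) seen on a "Timing Runner" line
    for line in lines:
        m = RUNNER_RE.search(line)
        if m:
            last = (m.group("layer"), m.group("runner"))
            continue
        m = NO_TACTIC_RE.search(line)
        # Only report when the complaint names the runner that was just timed.
        if m and last is not None and last[1] == m.group("runner"):
            hits.append(last)
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_log.py <path to a verbose log>
    with open(sys.argv[1], encoding="utf-8", errors="replace") as f:
        for layer, runner in layers_without_tactics(f):
            print(f"{layer}: {runner} has no valid tactics")
```

A runner without valid tactics is not an error by itself, since another runner may still handle the layer, so the output is a list of places to look at rather than a diagnosis.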

I will now check whether the problem is related to a specific version of TensorRT.

Nov 18 '23 08:11 WolframRhodium

Here is my full log: trtexec_231118_155338.log

Nov 18 '23 08:11 aloola18

I have reported this issue to NVIDIA. Let's see how they reply.

Nov 18 '23 09:11 WolframRhodium

They said they're working on it.

TensorRT 9.2.0, released today, still suffers from this problem.

Nov 28 '23 04:11 WolframRhodium

What difference would static_shape=False make? I've looked into the differences between static and dynamic shapes and I kind of get the idea. But I want to know practically speaking will it make any difference when I'm running these with SVP?

Dec 04 '23 16:12 netExtra

> What difference would static_shape=False make? I've looked into the differences between static and dynamic shapes and I kind of get the idea. But I want to know practically speaking will it make any difference when I'm running these with SVP?

It simply means that you don't have to build an engine every time you change resolution.
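To make the practical difference concrete, here is a toy model of engine reuse. It is not the vs-mlrt implementation: the class `EngineCache`, its fields, and the 1920x1080 upper bound are all hypothetical, and the "build" is just a list append standing in for a slow TensorRT engine build.

```python
from dataclasses import dataclass, field


@dataclass
class EngineCache:
    """Toy model of engine reuse; hypothetical, not how vs-mlrt is implemented."""
    static_shape: bool
    max_shape: tuple = (1920, 1080)  # assumed upper bound for the dynamic engine
    built: list = field(default_factory=list)  # keys of engines "built" so far

    def engine_for(self, width, height):
        if self.static_shape:
            # Static: the engine is tied to one exact resolution.
            key = ("static", width, height)
        else:
            if width > self.max_shape[0] or height > self.max_shape[1]:
                raise ValueError("resolution exceeds the dynamic engine's max shape")
            # Dynamic: one engine covers every resolution up to max_shape.
            key = ("dynamic", *self.max_shape)
        if key not in self.built:
            self.built.append(key)  # stands in for a slow engine build
        return key


static = EngineCache(static_shape=True)
dynamic = EngineCache(static_shape=False)
for w, h in [(1920, 1080), (1280, 720), (1920, 1080)]:
    static.engine_for(w, h)
    dynamic.engine_for(w, h)

print(len(static.built), len(dynamic.built))
```

With the three clips above, the static cache builds two engines (one per distinct resolution) and the dynamic cache builds one.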

Dec 04 '23 17:12 KLC04

As a side note, RIFE v4.13 was re-released with a new architecture by hzwer. It might be useful to re-export the ONNX model to see whether that fixes this issue.

Dec 04 '23 18:12 KLC04

I have already implemented a fix similar to the re-release. It does not change the model architecture; it simply renames weights in the model. This is an issue with TensorRT rather than RIFE itself.

Dec 04 '23 22:12 WolframRhodium

TensorRT 9.3.0, released today, still suffers from this problem.

Feb 01 '24 00:02 WolframRhodium

TensorRT 10.0.0, released today, still suffers from this problem.

Mar 27 '24 04:03 WolframRhodium

The ONNX files remain unchanged.

For TensorRT, you only need to update the files vstrt.dll and vsmlrt.py, and the whole folder vsmlrt-cuda.

Optionally, you can go to the folders rife(_v2) and delete all .engine, .cache and .lock files, because engines built with an older version of TensorRT cannot (by default) be used by a newer version.
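The optional cleanup can be scripted. The sketch below only looks at files directly inside the folder you pass in and matches the .engine, .cache and .lock suffixes; the function names are made up for illustration, and the folder path is something you supply yourself. It defaults to a dry run so that nothing is deleted until you have checked the list.

```python
from pathlib import Path

# Suffixes of files that can be rebuilt and are safe to remove after an update.
STALE_SUFFIXES = {".engine", ".cache", ".lock"}


def find_stale_files(model_dir):
    """List files directly inside model_dir whose last suffix is a stale one."""
    root = Path(model_dir)
    return sorted(p for p in root.iterdir()
                  if p.is_file() and p.suffix in STALE_SUFFIXES)


def remove_stale_files(model_dir, dry_run=True):
    """Return the stale files; delete them only when dry_run is False."""
    stale = find_stale_files(model_dir)
    if not dry_run:
        for p in stale:
            p.unlink()
    return stale
```

Calling `remove_stale_files(folder)` first and reading the returned list before repeating the call with `dry_run=False` keeps the .onnx files and anything else in the folder untouched.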

Mar 27 '24 07:03 WolframRhodium

> Optionally, you can go to folder models/rife(_v2) and delete all .engine, .cache and .lock files, because engines for older version of trt cannot (by default) be used by newer version of trt.

Thanks, this is the answer I was looking for. I remember deleting the engines for previous versions, but I just wanted to be clear.

Mar 27 '24 07:03 netExtra

> onnx files remain unchanged.
>
> For trt, you only need to update the files vstrt.dll and vsmlrt.py, and the whole folder vsmlrt-cuda.
>
> Optionally, you can go to folders rife(_v2) and delete all .engine, .cache and .lock files, because engines for older version of trt cannot (by default) be used by newer version of trt.

Apologies, but do we know why TensorRT 10.0 affects RIFE so negatively?

Mar 27 '24 19:03 netExtra

I don't know.

Mar 27 '24 22:03 WolframRhodium

The original problem should be fixed in TensorRT 10.0.1.

On the other hand, I have not received a response to the performance regression bug report. I suspect the regression is caused by a premature compiler optimization that offloads parts of the computational graph (related to /GridSample_3) to a worker stream and breaks operator fusion.

Apr 01 '24 03:04 WolframRhodium
