KarlDe1
I'm currently using the Qwen2.5-VL model in a single-node, single-process environment, with 8 H20 GPUs on one machine. I want to deploy the model on Triton, with each GPU loading...
Seeking help from @deadeyegoodwin @GuanLuo @tanmayv25, thanks!
I hope the developers can respond soon.
Hi @poweiw, my model uses my custom plugin in multiple places, but all shapes in the model are fixed. At the moment, I am not sure what could be causing...
[related_issue](https://github.com/NVIDIA/cudnn-frontend/issues/189)
@poweiw Thanks for your reply. I have two questions: #### (1) I would like to clarify what **“changed”** means here. Does it mean that `configurePlugin()` will be triggered...