Same GPU build, same files, but got the error: The engine plan file is generated on an incompatible device, expecting compute 9.0 got compute 8.9, please rebuild.
System Info
Running on an H200. The engine was built on an H200, as outlined by the config.json file. When deploying to an H200, I received this error: The engine plan file is generated on an incompatible device, expecting compute 9.0 got compute 8.9, please rebuild.
Who can help?
No response
Information
- [ ] The official example scripts
- [ ] My own modified scripts
Tasks
- [ ] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
1. Install TensorRT-LLM and follow the quick guide setup
2. Compile the rank.engine files
3. Deploy with Triton
Expected behavior
The engine file should run, since it was built on the exact same GPU architecture.
actual behavior
I receive this error: The engine plan file is generated on an incompatible device, expecting compute 9.0 got compute 8.9, please rebuild.
additional notes
I'm not sure what the exact issue is here. The engine build step worked fine, so it's odd to see a mismatch.
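For anyone trying to narrow this down: compute 9.0 is the Hopper generation (H100/H200) and compute 8.9 is the Ada generation, so one side of the comparison is apparently not seeing an H200. Below is a minimal sketch for checking this. It pulls the two versions out of the error text and lists what the current environment sees. The helper names are hypothetical, and it assumes nvidia-smi is on the PATH and new enough to support the compute_cap query field. Running it in both the build environment and the Triton container may show whether they see the same device.

```python
import re
import shutil
import subprocess

# Error text copied from the report above.
ERROR = ("The engine plan file is generated on an incompatible device, "
         "expecting compute 9.0 got compute 8.9, please rebuild.")


def parse_compute_mismatch(message):
    """Return the two versions in the error as (expecting, got), or None."""
    m = re.search(r"expecting compute (\d+\.\d+) got compute (\d+\.\d+)", message)
    return (m.group(1), m.group(2)) if m else None


def visible_gpus():
    """List (name, compute_cap) for each GPU that nvidia-smi reports.

    Assumption: the installed driver supports the compute_cap query field.
    Returns an empty list if nvidia-smi is missing or the query fails.
    """
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,compute_cap", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    if result.returncode != 0:
        return []
    gpus = []
    for line in result.stdout.strip().splitlines():
        # Split on the last comma so the capability is always the final field.
        name, _, cap = line.rpartition(",")
        gpus.append((name.strip(), cap.strip()))
    return gpus


if __name__ == "__main__":
    print("versions in error:", parse_compute_mismatch(ERROR))
    print("GPUs visible here:", visible_gpus())
```

If the Triton container reports a device with compute capability 8.9, or the engine directory being loaded was produced on a different machine than expected, that would account for the mismatch. The error message alone does not make clear which of the two numbers belongs to the engine and which to the device.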
@JoJoLev
Hi, can you share the concrete steps to reproduce the issue?
Thanks June