RuntimeError: Number of IO tensors is not correct, must be 116, but you have 115 tensors
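I followed your guide step by step through the TensorRT conversion on the CPU version, and everything went smoothly. But when I ran demo.py, I got the error below. What could be causing it?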
(tensorRT) user@lsp-ws:~/data/ChatGLM2-6B-TensorRT$ python demo.py
<module 'ckernel' from '/home/user/.cache/torch_extensions/py310_cu118/ckernel/ckernel.so'>
<class 'ckernel.Kernel'>
<instancemethod forward at 0x7f14fecfb8b0>
INFO: Loaded engine size: 11916 MiB
INFO: [MemUsageChange] TensorRT-managed allocation in engine deserialization: CPU +0, GPU +11912, now: CPU 0, GPU 11912 (MiB)
Traceback (most recent call last):
File "/home/user/data/ChatGLM2-6B-TensorRT/demo.py", line 222, in
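demo.py has not been adapted yet. trt-llm is currently in internal testing and is expected to be released next month, so perhaps wait a little longer?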
OK, thank you!
You're welcome.