
Support internlm2

Open · RunningLeon opened this issue 10 months ago • 12 comments

This PR supports the conversion of internlm2 from hf to trt-llm checkpoints with:

  • fp16/bf16
  • tp (tensor parallelism; see the example command after this list)
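
For example, a tensor-parallel conversion could look like the following (a sketch only: the model path is a placeholder, and --tp_size is assumed to follow the same convention as the other TensorRT-LLM convert_checkpoint.py examples):

    python3 convert_checkpoint.py --model_dir ./internlm2-chat-7b \
        --dtype bfloat16 \
        --tp_size 2 \
        --output_dir ./tllm_checkpoint_2gpu_tp2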

RunningLeon avatar Apr 02 '24 10:04 RunningLeon

[screenshot omitted] After making the corresponding changes, the engine build fails. How can I solve this? My commands are:

python3 convert_checkpoint.py --model_dir /mnt/checkpoint/models/internlm2-chat-20b/ --dtype float16 --output_dir /mnt/tllm_checkpoint/internlm2-chat-20b/tllm_checkpoint_1gpu_tp1

trtllm-build --checkpoint_dir /mnt/tllm_checkpoint/internlm2-chat-20b/tllm_checkpoint_1gpu_tp1/ --output_dir /mnt/trt_engines/internlm2-chat-20b/fp16/1-gpu/ --gemm_plugin float16 --max_batch_size=32 --max_input_len=2048 --max_output_len=2048

PaulX1029 avatar Apr 09 '24 10:04 PaulX1029

@PaulX1029 Hi, have you rebuilt and reinstalled tensorrt-llm? You can find the installation location with pip3 show tensorrt_llm, then check tensorrt_llm/models/__init__.py to see whether MODEL_MAP is as expected.
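
For example, a quick sanity check from Python (a sketch; "InternLM2ForCausalLM" is an assumed registry key based on the HF architecture name, so adjust it to whatever your patched __init__.py registers):

    # Confirm the installed tensorrt_llm is the patched build.
    from tensorrt_llm.models import MODEL_MAP
    # Prints None if the internlm2 entry is missing from MODEL_MAP.
    print(MODEL_MAP.get("InternLM2ForCausalLM"))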

RunningLeon avatar Apr 10 '24 06:04 RunningLeon

@RunningLeon What build method did you use? I installed trtllm from pip, and I'd like to align with your build method and try again.

PaulX1029 avatar Apr 10 '24 08:04 PaulX1029

> @RunningLeon What build method did you use? I installed trtllm from pip, and I'd like to align with your build method and try again.

https://github.com/NVIDIA/TensorRT-LLM/pull/266#issuecomment-2022311767

RunningLeon avatar Apr 10 '24 09:04 RunningLeon

@RunningLeon Thanks a lot for your work! Is there any difference between the internlm2-20b network in internvl-1.5 and the regular internlm2-20b? After converting with the script in this PR, the output is all garbled.

cqy930325 avatar Apr 24 '24 06:04 cqy930325

@RunningLeon Hi, we fine-tuned the internlm2 model with LoRA. We can convert the base model to llama format, but not the LoRA part; we tried modifying InternLM/tools/convert2llama.py to transfer the LoRA weights to llama style, but it did not work. Is there any other tool that works for LoRA? Note: we cannot export the base model and the LoRA into one single model, because we want to use the multi-LoRA feature (https://github.com/NVIDIA/TensorRT-LLM/tree/main/examples/llama#run-llama-with-several-lora-checkpoints). Thus, if we can transfer the internlm2 LoRA to llama style, we can then use examples/hf_lora_convert.py to build it for trt-llm.
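
For concreteness, the mapping we are after looks roughly like this (a hypothetical, untested sketch: the PEFT key names, the rename table, and the interleaved wqkv layout are assumptions that must be checked against the actual checkpoint):

    import torch

    # internlm2 module names -> llama module names (non-fused projections).
    RENAME = {
        "attention.wo": "self_attn.o_proj",
        "feed_forward.w1": "mlp.gate_proj",
        "feed_forward.w3": "mlp.up_proj",
        "feed_forward.w2": "mlp.down_proj",
    }

    def split_wqkv_lora_b(lora_b, num_heads, num_kv_heads, head_dim):
        # internlm2 packs wqkv per KV group as [q_0 .. q_{g-1}, k, v].
        g = num_heads // num_kv_heads
        w = lora_b.view(num_kv_heads, g + 2, head_dim, -1)
        q = w[:, :g].reshape(num_heads * head_dim, -1)
        k = w[:, -2].reshape(num_kv_heads * head_dim, -1)
        v = w[:, -1].reshape(num_kv_heads * head_dim, -1)
        return q, k, v

    def convert_lora(state_dict, num_heads, num_kv_heads, head_dim):
        out = {}
        for name, t in state_dict.items():
            if "attention.wqkv" in name:
                if "lora_A" in name:
                    # lora_A is shared by q/k/v, so duplicate it per projection.
                    for p in ("q_proj", "k_proj", "v_proj"):
                        out[name.replace("attention.wqkv", f"self_attn.{p}")] = t
                else:
                    # lora_B splits along the fused output dimension.
                    q, k, v = split_wqkv_lora_b(t, num_heads, num_kv_heads, head_dim)
                    for p, w in zip(("q_proj", "k_proj", "v_proj"), (q, k, v)):
                        out[name.replace("attention.wqkv", f"self_attn.{p}")] = w
            else:
                new = name
                for src, dst in RENAME.items():
                    new = new.replace(src, dst)
                out[new] = t
        return out

    # Usage (hypothetical path; head counts here are internlm2-20b's):
    # lora = torch.load("adapter_model.bin", map_location="cpu")
    # llama_style = convert_lora(lora, num_heads=48, num_kv_heads=8, head_dim=128)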

ChengYouFancy avatar Apr 25 '24 09:04 ChengYouFancy

@nv-guomingz Hi, sorry to bother you, but when will this PR be merged? Do I need to fix the conflicts? THX

RunningLeon avatar May 08 '24 02:05 RunningLeon

@nv-guomingz Hi, the conflicts with the main branch are resolved. Looking forward to your review comments. THX.

RunningLeon avatar May 21 '24 07:05 RunningLeon

@RunningLeon May I ask why internlm2 needs its own convert_checkpoint.py instead of reusing llama's convert_checkpoint.py? internlm uses llama's convert_checkpoint.py directly.

DefTruth avatar May 23 '24 10:05 DefTruth

> @RunningLeon May I ask why internlm2 needs its own convert_checkpoint.py instead of reusing llama's convert_checkpoint.py? internlm uses llama's convert_checkpoint.py directly.

Hi, internlm2's W_qkv is fused into a single tensor, and some parameter names are not aligned with llama's, so llama's convert_checkpoint.py cannot be used directly.
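
Concretely, the fused tensor has to be de-interleaved before llama-style q/k/v weights can be populated. A minimal sketch, assuming the per-KV-group interleaved layout used by the HF internlm2 modeling code ([q_0 .. q_{g-1}, k, v] per group):

    def split_wqkv(wqkv, num_heads, num_kv_heads, head_dim):
        # wqkv: (num_kv_heads * (g + 2) * head_dim, hidden_size)
        g = num_heads // num_kv_heads
        w = wqkv.view(num_kv_heads, g + 2, head_dim, wqkv.shape[-1])
        q = w[:, :g].reshape(num_heads * head_dim, -1)     # -> llama q_proj
        k = w[:, -2].reshape(num_kv_heads * head_dim, -1)  # -> llama k_proj
        v = w[:, -1].reshape(num_kv_heads * head_dim, -1)  # -> llama v_proj
        return q, k, v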

RunningLeon avatar May 23 '24 11:05 RunningLeon

> @RunningLeon May I ask why internlm2 needs its own convert_checkpoint.py instead of reusing llama's convert_checkpoint.py? internlm uses llama's convert_checkpoint.py directly.

> Hi, internlm2's W_qkv is fused into a single tensor, and some parameter names are not aligned with llama's, so llama's convert_checkpoint.py cannot be used directly.

Thank you for this explanation!

DefTruth avatar May 24 '24 03:05 DefTruth

Hi @RunningLeon, sorry for the late response due to internal task priorities. Would you please rebase the code first? I'll then try to merge your MR into the main branch this week.

nv-guomingz avatar May 28 '24 06:05 nv-guomingz

> Hi @RunningLeon, sorry for the late response due to internal task priorities. Would you please rebase the code first? I'll then try to merge your MR into the main branch this week.

@nv-guomingz Done. Hope merging with main is OK.

RunningLeon avatar May 28 '24 12:05 RunningLeon

> Hi @RunningLeon, sorry for the late response due to internal task priorities. Would you please rebase the code first? I'll then try to merge your MR into the main branch this week.

> @nv-guomingz Done. Hope merging with main is OK.

Thanks @RunningLeon. Could you please squash your commits into a single commit? That would make further integration easier.

nv-guomingz avatar May 29 '24 04:05 nv-guomingz

Hi @RunningLeon, I've managed to file the merge request in our internal repo and testing is ongoing. If everything goes well, this MR will be upstreamed next week. Thanks again for your contribution.

nv-guomingz avatar Jun 03 '24 08:06 nv-guomingz

@RunningLeon InternLM2 has been added in today's update. Please see the notes here: https://github.com/NVIDIA/TensorRT-LLM/discussions/1726#discussion-6776859

nv-guomingz avatar Jun 04 '24 13:06 nv-guomingz