CogVLM fails to build after convert
System Info
TensorRT-LLM version: 0.11
Model: THUDM/cogvlm-chat-hf

I'm trying to build CogVLM with the provided example, but the build step fails with an error.
Who can help?
@byshiue
Information
- [x] The official example scripts
- [ ] My own modified scripts
Tasks
- [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
In examples/cogvlm:

```bash
python convert_checkpoint.py --model_dir $HUGGINGFACE_HUB_CACHE/models--THUDM--cogvlm-chat-hf/snapshots/e29dc3ba206d524bf8efbfc60d80fc4556ab0e3c \
    --output_dir ./tllm_checkpoint_1gpu_fp16_cogvlm1 \
    --dtype float16
```
This succeeds, but the build command:
```bash
trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_fp16_cogvlm1 \
    --output_dir ./tmp/cogvlm1/17B/trt_engines/fp16/1-gpu \
    --gemm_plugin bfloat16 \
    --gpt_attention_plugin bfloat16 \
    --context_fmha_fp32_acc enable \
    --remove_input_padding disable \
    --max_batch_size 48 \
    --max_input_len 2048 \
    --max_seq_len 3076 \
    --paged_kv_cache disable \
    --use_custom_all_reduce disable \
    --enable_xqa disable \
    --bert_attention_plugin disable \
    --moe_plugin disable \
    --max_multimodal_len 61440
```
fails with the following error:

```text
File "<redacted>/tensorrt_llm/module.py", line 40, in __call__
    output = self.forward(*args, **kwargs)
TypeError: PromptTuningEmbedding.forward() missing 3 required positional arguments: 'prompt_embedding_table', 'tasks', and 'task_vocab_size'
```
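For context, here is a minimal, self-contained sketch of the failure mode, assuming (I have not verified this against the library source) that the model graph calls the embedding as if it were a plain Embedding while the layer is actually a PromptTuningEmbedding that requires the prompt-tuning tensors:

```python
# Minimal sketch of the failure mode seen in the traceback. The class names
# mirror those in the traceback, but the bodies are hypothetical
# simplifications, not TensorRT-LLM's actual implementation.
class Module:
    def __call__(self, *args, **kwargs):
        # Matches tensorrt_llm/module.py in the traceback: arguments are
        # forwarded verbatim to the subclass's forward().
        return self.forward(*args, **kwargs)

class Embedding(Module):
    def forward(self, tokens):
        return f"embed({tokens})"

class PromptTuningEmbedding(Embedding):
    # The traceback implies three extra required positional arguments.
    def forward(self, tokens, prompt_embedding_table, tasks, task_vocab_size):
        return f"embed({tokens}) + prompt-table lookup"

emb = PromptTuningEmbedding()
try:
    # A call site written for the plain Embedding passes only the tokens,
    # so the three prompt-tuning arguments are never supplied:
    emb([1, 2, 3])
except TypeError as e:
    print(e)
    # PromptTuningEmbedding.forward() missing 3 required positional
    # arguments: 'prompt_embedding_table', 'tasks', and 'task_vocab_size'
```

If that reading is right, the build presumably needs the prompt-tuning inputs wired in (as far as I understand, multimodal features are fed through this path); I have not pinned down which build option is missing, so treat the sketch above purely as an illustration of the error mechanism.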
Expected behavior
The engine builds and runs successfully.
actual behavior
trtllm-build fails with the TypeError shown above.
additional notes
N/A