
CogVLM fails to build after convert

Open · sxyu opened this issue on Jul 19, 2024 · 0 comments

System Info

Version: 0.11
Model: THUDM/cogvlm-chat-hf

I'm trying to build CogVLM with the provided example, but it fails with an error.

Who can help?

@byshiue

Information

  • [x] The official example scripts
  • [ ] My own modified scripts

Tasks

  • [x] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • [ ] My own task or dataset (give details below)

Reproduction

In examples/cogvlm

python convert_checkpoint.py \
    --model_dir $HUGGINGFACE_HUB_CACHE/models--THUDM--cogvlm-chat-hf/snapshots/e29dc3ba206d524bf8efbfc60d80fc4556ab0e3c \
    --output_dir ./tllm_checkpoint_1gpu_fp16_cogvlm1 \
    --dtype float16
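
For reference, a quick way to confirm what the conversion wrote out, assuming the usual TensorRT-LLM checkpoint layout where convert_checkpoint.py emits a config.json into --output_dir (the field names below are assumptions from that schema, not something verified in this report):

import json

# Hypothetical sanity check: read the checkpoint description written by
# convert_checkpoint.py. "architecture" and "dtype" are assumed field names.
with open("./tllm_checkpoint_1gpu_fp16_cogvlm1/config.json") as f:
    cfg = json.load(f)

# The reported dtype should match the --dtype float16 flag above.
print(cfg.get("architecture"), cfg.get("dtype"))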

The conversion itself succeeds. But the subsequent build command:

trtllm-build --checkpoint_dir ./tllm_checkpoint_1gpu_fp16_cogvlm1 \
    --output_dir ./tmp/cogvlm1/17B/trt_engines/fp16/1-gpu \
    --gemm_plugin bfloat16 \
    --gpt_attention_plugin bfloat16 \
    --context_fmha_fp32_acc enable \
    --remove_input_padding disable \
    --max_batch_size 48 \
    --max_input_len 2048 \
    --max_seq_len 3076 \
    --paged_kv_cache disable \
    --use_custom_all_reduce disable \
    --enable_xqa disable \
    --bert_attention_plugin disable \
    --moe_plugin disable \
    --max_multimodal_len 61440

Fails with the following error:

  File "<redacted>/tensorrt_llm/module.py", line 40, in __call__
    output = self.forward(*args, **kwargs)
TypeError: PromptTuningEmbedding.forward() missing 3 required positional arguments: 'prompt_embedding_table', 'tasks', and 'task_vocab_size'
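
To illustrate the mechanics of the failure, here is a minimal, self-contained sketch; the classes are simplified stand-ins for tensorrt_llm's Module and PromptTuningEmbedding, not the real implementations, with the argument names taken from the traceback:

# Stand-in sketch, not the real TensorRT-LLM classes.
class Module:
    def __call__(self, *args, **kwargs):
        # Mirrors tensorrt_llm/module.py line 40 from the traceback.
        return self.forward(*args, **kwargs)

class PromptTuningEmbedding(Module):
    # Unlike a plain embedding, forward() requires the prompt-tuning inputs.
    def forward(self, tokens, prompt_embedding_table, tasks, task_vocab_size):
        pass

emb = PromptTuningEmbedding()
emb([1, 2, 3])  # TypeError: missing 'prompt_embedding_table', 'tasks', 'task_vocab_size'

In other words, at build time the embedding layer is being called with only the token ids; the prompt-tuning inputs that would carry the multimodal features (presumably what --max_multimodal_len sizes) are apparently not wired into the call.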

Expected behavior

The engine builds and runs successfully.

Actual behavior

The build fails with the TypeError shown above.

Additional notes

N/A

sxyu · Jul 19 '24 19:07