TensorRT-LLM
When will the high-level API support the Qwen model?
When I run the following script:
```bash
python3 llm_examples.py --task run_llm_from_huggingface_model \
    --prompt="你是谁?" \
    --tp_size=8 \
    --dump_engine_dir /docker_storage/trtModels/fp16/8-gpu/Qwen1.5-72B-Chat \
    --hf_model_dir=/docker_storage/Qwen1.5-72B-Chat
```
it fails with:

```
KeyError: 'Unsupported model architecture: Qwen2ForCausalLM, only LlamaForCausalLM, MixtralForCausalLM are supported now.'
```
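For reference, this is a minimal sketch of what the same run could look like through the high-level `LLM` API in a release where `Qwen2ForCausalLM` is registered. The `LLM` and `SamplingParams` names follow the documented high-level API, but Qwen support and the exact keyword names may differ by version, so treat this as an assumption-laden illustration rather than a confirmed recipe:

```python
# Hypothetical sketch: assumes a TensorRT-LLM release whose high-level API
# supports Qwen2ForCausalLM; argument names may vary between versions.
from tensorrt_llm import LLM, SamplingParams

# Load the local Hugging Face checkpoint with 8-way tensor parallelism,
# mirroring --tp_size=8 from the command above.
llm = LLM(
    model="/docker_storage/Qwen1.5-72B-Chat",
    tensor_parallel_size=8,
)

params = SamplingParams(max_tokens=64)

# "你是谁?" is Chinese for "Who are you?", the same prompt as in the script.
for output in llm.generate(["你是谁?"], params):
    print(output.outputs[0].text)
```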