TensorRT-LLM
There are differences in the results of Qwen2-7B-Instruct between transformers and TensorRT-LLM
System Info
- GPU: L20
- TensorRT-LLM: v0.11.0
- transformers: 4.42.0
Who can help?
@ncomly-nvidia @kaiyux

prompt = '你好,请介绍一下喜马拉雅山的详细信息' (i.e., "Hello, please give a detailed introduction to the Himalayas")
1. transformers

Generation parameters:

```python
generation_config = GenerationConfig(
    top_k=1,
    temperature=1,
    max_length=2048,
    max_new_tokens=80,
    repetition_penalty=1.0,
    early_stopping=True,
    do_sample=True,
    num_beams=1,
    top_p=1,
    pad_token_id=tokenizer.pad_token_id,
    eos_token_id=tokenizer.eos_token_id,
)
```

transformers result (truncated at max_new_tokens=80):

```
喜马拉雅山(Himalayas)是地球上最高的山脉,位于亚洲南部,横跨中国、印度、尼泊尔、不丹、巴基斯坦和阿富汗等国家。以下是关于喜马拉雅山的一些详细信息:

地理位置与范围

喜马拉雅山脉从中国西藏的喜马拉雅山脉开始,向南延伸至印度的喜马拉雅山脉,
```
2. TensorRT-LLM

Generation parameters:

```python
batch_input_ids=input_ids,
max_new_tokens=80,
end_id=tokenizer.eos_token_id,
pad_id=tokenizer.pad_token_id,
top_k=1
```

TensorRT-LLM result (truncated at max_new_tokens=80):

```
你好!喜马拉雅山(Himalayas)是地球上最壮观的山脉之一,位于亚洲南部,横跨中国、印度、尼泊尔、不丹、巴基斯坦和阿富汗等国家。以下是关于喜马拉雅山的一些详细信息:

地理位置与范围

喜马拉雅山脉从中国西藏的喜马拉雅山脉开始,向南延伸至印度的
```
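Both runs use top_k=1, which should make decoding deterministic: sampling from the top-1 candidate reduces to an argmax over the logits, so in principle the two frameworks should pick the same token at every step. A minimal pure-Python sketch of this (illustrative only, not the actual HF or TensorRT-LLM sampling kernels):

```python
def sample_top_k(logits, k=1):
    """Keep the k highest logits and 'sample' among them.

    With k=1 only the argmax token survives the filter, so
    do_sample=True behaves exactly like greedy decoding.
    """
    # Indices sorted by logit, highest first
    ranked = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)
    survivors = ranked[:k]
    # With k == 1 there is exactly one survivor: the argmax.
    return survivors[0] if k == 1 else survivors

logits = [0.1, 2.5, -1.0, 2.4]
print(sample_top_k(logits, k=1))  # -> 1 (the argmax)
```

Given identical logits, top_k=1 leaves no randomness, so any divergence must come from the logits themselves, not from the sampler.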
3. How the input_ids are created

```python
prompt = '你好,请介绍一下喜马拉雅山的详细信息'
messages = [{"role": "user", "content": prompt}]
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
input_ids = tokenizer(prompt, truncation=True, return_tensors="pt", add_special_tokens=False)['input_ids']
```
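For reference, Qwen2-Instruct's chat template (as shipped in its tokenizer_config.json, to the best of my understanding; verify against your checkpoint) expands a single user message into the ChatML-style string below, so both frameworks should see identical input tokens. A hand-rolled sketch of the expansion:

```python
def build_qwen2_prompt(user_msg,
                       system_msg="You are a helpful assistant."):
    # ChatML-style layout used by Qwen2-Instruct (assumed default
    # system prompt; check the tokenizer_config.json of your checkpoint).
    return (
        f"<|im_start|>system\n{system_msg}<|im_end|>\n"
        f"<|im_start|>user\n{user_msg}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_qwen2_prompt('你好,请介绍一下喜马拉雅山的详细信息')
```

Comparing this string (and the resulting input_ids) between the two pipelines is a quick way to rule out a prompt-formatting mismatch before blaming the engine.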
4. Building the Qwen2-7B engine

```
python convert_checkpoint.py --model_dir /mnt/qwen2/Qwen2-7B-Instruct \
    --output_dir checkpoint \
    --dtype float16

trtllm-build --checkpoint_dir ./checkpoint \
    --output_dir ./fp16 \
    --gemm_plugin float16
```
Information
- [X] The official example scripts
- [ ] My own modified scripts
Tasks
- [X] An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- [ ] My own task or dataset (give details below)
Reproduction
- Run transformers and TensorRT-LLM separately on the same input
- Compare the generated tokens: the outputs differ even though the prompts are identical
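To make the token comparison concrete, a small helper (hypothetical, not part of either library) can report the first position where the two generated sequences split:

```python
def first_divergence(tokens_a, tokens_b):
    """Return the index of the first differing token, or None if one
    sequence is a prefix of the other (or they are identical)."""
    for i, (a, b) in enumerate(zip(tokens_a, tokens_b)):
        if a != b:
            return i
    return None

hf_tokens  = [108386, 3837, 104169, 99882]   # illustrative ids only
trt_tokens = [108386, 6313, 104169, 99882]
print(first_divergence(hf_tokens, trt_tokens))  # -> 1
```

With greedy decoding, everything after the first divergent token is expected to differ, so the interesting question is only why that single position flipped.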
Expected behavior
1. I expect the Qwen2 output of TensorRT-LLM to be exactly aligned with that of transformers.
actual behavior
1. There are some differences in the results.
2. Across many test cases, approximately 5-10% are not fully aligned.
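One plausible (though unconfirmed here) mechanism for this kind of partial divergence: when the top two logits at some step are nearly tied, tiny fp16 numerical differences between the two stacks' kernels can flip the argmax, and greedy decoding then turns that single flip into an entirely different continuation. A toy illustration with made-up numbers:

```python
def argmax(xs):
    return max(range(len(xs)), key=lambda i: xs[i])

# Two near-tied candidates: a perturbation on the order of fp16's
# resolution at this magnitude is enough to swap the winner.
logits_hf  = [5.1231, 5.1229, 1.0]
logits_trt = [5.1229, 5.1231, 1.0]  # same values +/- kernel noise

print(argmax(logits_hf), argmax(logits_trt))  # -> 0 1
```

If this is the cause, the ~5-10% figure would simply reflect how often a near-tie occurs somewhere in the first 80 generated tokens.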
additional notes
Nothing