TensorRT-LLM
DeepSeek MoE support
This PR adds support for DeepSeek MoE (https://huggingface.co/deepseek-ai/deepseek-moe-16b-base).
Main differences from Mixtral:
- Shared experts
- First layers are dense
- MoE normalization disabled
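To make the three differences concrete, here is a toy, pure-Python sketch of the routing logic. It is not the code from this PR: the helper names (`route`, `moe_layer`, `uses_moe`), the scalar "experts", and all sizes are hypothetical, and the Mixtral-style branch assumes that Mixtral renormalizes the selected top-k weights so they sum to 1.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route(gate_logits, top_k, normalize):
    """Return {expert_index: weight} for the top_k routed experts."""
    probs = softmax(gate_logits)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    weights = {i: probs[i] for i in top}
    if normalize:
        # Assumed Mixtral-style behavior: rescale selected weights to sum to 1.
        s = sum(weights.values())
        weights = {i: w / s for i, w in weights.items()}
    # With normalize=False the raw softmax probabilities are kept as-is
    # ("MoE normalization disabled").
    return weights


def moe_layer(x, gate_logits, routed_experts, shared_experts, top_k, normalize=False):
    """Weighted sum of the selected routed experts plus every shared expert."""
    weights = route(gate_logits, top_k, normalize)
    out = sum(w * routed_experts[i](x) for i, w in weights.items())
    # Shared experts bypass the router and always contribute.
    out += sum(expert(x) for expert in shared_experts)
    return out


def uses_moe(layer_idx, first_dense_layers):
    """Layers below first_dense_layers use a plain dense MLP instead of MoE."""
    return layer_idx >= first_dense_layers


if __name__ == "__main__":
    experts = [lambda x: x for _ in range(4)]  # toy scalar experts
    shared = [lambda x: x]                     # one toy shared expert
    print(route([0.0, 0.0, 0.0, 0.0], top_k=2, normalize=False))
    print(route([0.0, 0.0, 0.0, 0.0], top_k=2, normalize=True))
    print(moe_layer(2.0, [0.0, 0.0, 0.0, 0.0], experts, shared, top_k=2))
    print([uses_moe(i, first_dense_layers=1) for i in range(3)])
```

With normalization disabled, the routed weights generally sum to less than 1, so the relative contribution of the shared experts differs from what a Mixtral-style layer would produce.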
Build:
cd TensorRT-LLM/examples/llama
python convert_checkpoint.py --model_dir /models/deepseek-moe-16b-base/ --dtype float16 --output_dir /trtllm/deepseek-moe-16b-base/1-gpu-tmp/
trtllm-build --checkpoint_dir /trtllm/deepseek-moe-16b-base/1-gpu-tmp/ --output_dir /trtllm/deepseek-moe-16b-base/1-gpu --max_batch_size 32 --max_input_len 3072 --max_output_len 1024 --max_num_tokens 32768 --gpt_attention_plugin float16 --gemm_plugin float16 --context_fmha enable --paged_kv_cache enable --remove_input_padding enable --use_paged_context_fmha enable
Run:
cd TensorRT-LLM/examples/
python run.py --engine_dir /trtllm/deepseek-moe-16b-base/1-gpu --tokenizer_dir /models/deepseek-moe-16b-base/ --max_output_len 32 --top_p 0 --input_text "The president of the United States is person who"
TensorRT-LLM Output:
[TensorRT-LLM] TensorRT-LLM version: 0.11.0.dev2024060400
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Input [Text 0]: "<|begin▁of▁sentence|>The president of the United States is person who"
Output [Text 0 Beam 0]: " is elected by the people of the United States to lead the country. The president is the head of the executive branch of the government. The president is the commander"
Transformers Output:
>>> tokenizer.batch_decode(model.generate(torch.LongTensor([tokenizer.encode("The president of the United States is person who")]).cuda(), max_new_tokens=32, do_sample=False))
The attention mask and the pad token id were not set. As a consequence, you may observe unexpected behavior. Please pass your input's `attention_mask` to obtain reliable results.
Setting `pad_token_id` to `eos_token_id`:100001 for open-end generation.
['<|begin▁of▁sentence|>The president of the United States is person who is elected by the people of the United States to lead the country. The president is the head of the executive branch of the government. The president is the commander']
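The two outputs above can be compared mechanically rather than by eye. The following is a minimal sketch, assuming the Transformers result is the decoded string with the BOS token and prompt still attached (as in the transcript above); the helper name `continuation` is hypothetical and not part of either library.

```python
BOS = "<|begin▁of▁sentence|>"
PROMPT = "The president of the United States is person who"

# Continuation printed by run.py (TensorRT-LLM), copied from the log above.
trt_output = (
    " is elected by the people of the United States to lead the country."
    " The president is the head of the executive branch of the government."
    " The president is the commander"
)

# Full decoded string returned by tokenizer.batch_decode (Transformers).
hf_decoded = (
    "<|begin▁of▁sentence|>The president of the United States is person who"
    " is elected by the people of the United States to lead the country."
    " The president is the head of the executive branch of the government."
    " The president is the commander"
)


def continuation(decoded, prompt, bos=BOS):
    """Strip an optional BOS token and the prompt, returning only new text."""
    text = decoded[len(bos):] if decoded.startswith(bos) else decoded
    if not text.startswith(prompt):
        raise ValueError("decoded text does not start with the prompt")
    return text[len(prompt):]


if __name__ == "__main__":
    match = continuation(hf_decoded, PROMPT) == trt_output
    print("outputs match:", match)
```

Both runs use greedy decoding (`--top_p 0` and `do_sample=False`) with 32 new tokens, so an exact string match is a reasonable check here.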