DeepSeek-V3.2 produces garbled output
Reminder
- [x] I have read the above rules and searched the existing issues.
System Info
AMX_METHOD=AMXINT8 SGLANG_ENABLE_JIT_DEEPGEMM=0 PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True \
python -m sglang.launch_server \
  --host 0.0.0.0 --port 60000 \
  --model /mnt/ktd/v3.2exp/ \
  --kt-weight-path ~/v3.2cpu/ \
  --kt-cpuinfer 116 \
  --kt-threadpool-count 2 \
  --kt-num-gpu-experts 5 \
  --kt-method AMXINT8 \
  --attention-backend flashinfer \
  --trust-remote-code \
  --mem-fraction-static 0.5 \
  --chunked-prefill-size 1024 \
  --max-running-requests 4 \
  --enable-mixed-chunk \
  --tensor-parallel-size 1 \
  --enable-p2p-check \
  --disable-shared-experts-fusion \
  --served-model-name kimi_k2 \
  --tool-call-parser deepseekv3 \
  --kt-max-deferred-experts-per-token 7
The above is the command I ran.
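The official DeepSeek-V3.2 release and DeepSeek-V3.2-Speciale came out today. How should I run them with ktransformers (what is the exact command)? My machine is a Xeon 5 with 768 GB of RAM and dual 4090 GPUs.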
https://github.com/kvcache-ai/ktransformers/blob/main/doc/en/kt-kernel/deepseek-v3.2-sglang-tutorial.md