Khaleesi
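> I noticed an issue: the first few rounds of conversation are fairly slow, and the speed gradually improves after a few more questions... so it needs a warm-up.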
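One way to check the warm-up effect described above is to time each round separately and average only the later ones. The following is a minimal sketch, not part of ktransformers: `generate` is a hypothetical callable standing in for whatever sends one prompt to the model, and `n_warmup=2` is an arbitrary assumption about how many slow rounds to discard.

```python
import time
from statistics import mean


def time_rounds(generate, prompts):
    """Call generate(prompt) once per prompt; return per-round latencies in seconds.

    `generate` is a hypothetical stand-in for one chat round with the model.
    """
    latencies = []
    for prompt in prompts:
        start = time.perf_counter()
        generate(prompt)
        latencies.append(time.perf_counter() - start)
    return latencies


def steady_state_mean(latencies, n_warmup=2):
    """Average latency after discarding the first n_warmup (assumed slow) rounds."""
    steady = latencies[n_warmup:]
    if not steady:
        raise ValueError("need more rounds than n_warmup")
    return mean(steady)
```

If the first entries returned by `time_rounds` are clearly larger than `steady_state_mean`, that matches the slow-start behavior reported here.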
> I couldn't figure out what to put there. Here is my command, for reference:
> python -m ktransformers.local_chat --model_path /workspace/Deepseek-models/DeepSeek-R1 --gguf_path /workspace/Deepseek-models/DeepSeek-R1-Q4_K_M --cpu_infer 65 --max_new_tokens 1000
> root@lts-4090:/workspace/ktransformers# ls /workspace/Deepseek-models/DeepSeek-R1
> config.json configuration_deepseek.py tokenizer.json configuration.json generation_config.json tokenizer_config.json
> root@lts-4090:/workspace/ktransformers# ls /workspace/Deepseek-models/DeepSeek-R1-Q4_K_M
> DeepSeek-R1-Q4_K_M-00001-of-00009.gguf DeepSeek-R1-Q4_K_M-00006-of-00009.gguf DeepSeek-R1-Q4_K_M-00002-of-00009.gguf DeepSeek-R1-Q4_K_M-00007-of-00009.gguf...