
How do I deploy on an RTX 50-series GPU?

Open xjn-La-La-land opened this issue 3 months ago • 7 comments

CPU: Intel Xeon w7-3565X (64) @ 4.80 GHz
GPU: NVIDIA GeForce RTX 5090 32 GB
RAM: DDR5-5600 ECC Registered, 64 GB × 16 = 1024 GB
OS: Ubuntu 22.04.4 LTS x86_64
KTransformers version: v0.3.2
CUDA version: 12.9

Running the DeepSeek-V2-Lite-Chat model:

python -m ktransformers.local_chat --model_path deepseek-ai/DeepSeek-V2-Lite-Chat --gguf_path ./DeepSeek-V2-Lite-Chat-GGUF

Neither of the Hugging Face transformers versions I tried seems to let KTransformers run:

  • transformers = 4.43.2
ImportError: cannot import name 'FlashAttentionKwargs' from 'transformers.modeling_flash_attention_utils' (/home/xiejianan/miniconda3/envs/ktransformers/lib/python3.11/site-packages/transformers/modeling_flash_attention_utils.py)
  • transformers = 4.56.1
ValueError: You should provide exactly one of layers or layer_class_to_replicate to initialize a Cache.

How can I deploy KTransformers on an RTX 50-series GPU (CUDA version above 12.8)?

xjn-La-La-land · Sep 18 '25

I have the same problem. Could someone share a working deployment setup for the 5090?

jyizheng · Sep 21 '25

Same problem here; transformers==4.56.2 throws:

ValueError: You should provide exactly one of `layers` or `layer_class_to_replicate` to initialize a Cache.

Yuan-Allen · Sep 24 '25

Successful RTX 5090 Installation Guide (Step 6)

Hardware Specs:

🖥️ Ryzen 9 9950X3D

🧠 DDR5-5600 256GB

🎮 RTX 5090

🚀 Samsung 990PRO 2TB + 2TB

Final Environment Status:

```text
🎯 Environment Check Successful
PyTorch: 2.8.0+cu128
Transformers: 4.56.1
KTransformers: 0.3.2
CUDA available: True
GPU: NVIDIA GeForce RTX 5090
```

Step 6: KTransformers Installation (Critical Steps)

```bash
# Clone repository
cd ~
git clone https://github.com/KVCache-AI/KTransformers.git
cd KTransformers

# Submodule initialization (IMPORTANT!)
git submodule init
git submodule update

# Installation
make dev_install

# Verification
python -c "import ktransformers; print('Installation successful!')"
```

Key Success Factors:

✅ Submodule initialization is mandatory

✅ PyTorch 2.8+ with CUDA 12.8 compatibility

✅ make dev_install for complete setup

✅ RTX 5090 works perfectly with KTransformers

Hope this helps others struggling with RTX 50 series setup! 🚀
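If you want to reproduce the environment check output above, here is a minimal sketch. The ktransformers package may not expose `__version__`, so this reads the installed package version via `importlib.metadata` instead:

```bash
python - <<'EOF'
from importlib.metadata import version
import torch, transformers

print("PyTorch:", torch.__version__)
print("Transformers:", transformers.__version__)
print("KTransformers:", version("ktransformers"))
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("GPU:", torch.cuda.get_device_name(0))
EOF
```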

RFharaga · Sep 24 '25

Can you run the Qwen3-235B model?

jyizheng · Sep 30 '25

Same issue on a 5080: ValueError: You should provide exactly one of `layers` or `layer_class_to_replicate` to initialize a Cache. Environment: transformers==4.57.0, ktransformers==0.3.2+cu128torch28avx2, CUDA 12.8, torch 2.8.0. I also tried a 4060 Ti and hit the same error. Does anyone know how to fix this?

zhizi42 · Oct 14 '25

After a few days of digging, the fix turned out to be simple: when you git clone, clone the tagged release instead of the default branch: git clone --branch v0.3.2 https://github.com/kvcache-ai/ktransformers.git
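Combined with the submodule and install steps from the guide earlier in this thread, the full sequence would look roughly like this (directory name follows the clone URL; adjust as needed):

```bash
# Clone the tagged release instead of the default branch
git clone --branch v0.3.2 https://github.com/kvcache-ai/ktransformers.git
cd ktransformers

# Submodules are still required
git submodule init
git submodule update

# Build and install
make dev_install
```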

zhizi42 · Oct 18 '25

> After a few days of digging, the fix turned out to be simple: when you git clone, clone the tagged release instead of the default branch: git clone --branch v0.3.2 https://github.com/kvcache-ai/ktransformers.git

Can a working Docker image be built from the Dockerfile in this repo? Which parts of that file need to be changed to support the 5090?
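I haven't checked the repo's Dockerfile myself, but as a rough sketch, the parts that usually matter for Blackwell cards are the CUDA base image version, the PyTorch wheel (cu128), and the CUDA arch list (RTX 5090 is compute capability 12.0). The build-arg names below are assumptions, not necessarily the ones the project's Dockerfile actually defines:

```bash
# Sketch only -- the ARG names are assumptions; check the repo's Dockerfile for the real ones.
# Essentials for an RTX 5090: CUDA >= 12.8 base image, cu128 PyTorch wheel,
# and sm_120 (compute capability 12.0) in the CUDA arch list.
docker build \
  --build-arg CUDA_VERSION=12.8.1 \
  --build-arg TORCH_CUDA_ARCH_LIST="12.0" \
  -t ktransformers:rtx5090 .

# Run with GPU access and a mounted model directory (path is illustrative)
docker run --gpus all -it --rm \
  -v /path/to/DeepSeek-V2-Lite-Chat-GGUF:/models \
  ktransformers:rtx5090
```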

cqllzp · Oct 26 '25