How do I deploy on RTX 50 series GPUs?
CPU: Intel Xeon w7-3565X (64) @ 4.80 GHz
GPU: NVIDIA GeForce RTX 5090 32G
RAM: DDR5-5600 ECC Registered 64G × 16 = 1024G
OS: Ubuntu 22.04.4 LTS x86_64
KTransformers version: v0.3.2
CUDA version: 12.9

Running the DeepSeek-V2-Lite-Chat model:

```bash
python -m ktransformers.local_chat --model_path deepseek-ai/DeepSeek-V2-Lite-Chat --gguf_path ./DeepSeek-V2-Lite-Chat-GGUF
```
Neither version of Hugging Face transformers that I tried was able to run KTransformers:
- transformers==4.43.2
ImportError: cannot import name 'FlashAttentionKwargs' from 'transformers.modeling_flash_attention_utils' (/home/xiejianan/miniconda3/envs/ktransformers/lib/python3.11/site-packages/transformers/modeling_flash_attention_utils.py)
- transformers==4.56.1
ValueError: You should provide exactly one of layers or layer_class_to_replicate to initialize a Cache.
So: how can KTransformers be deployed on RTX 50 series GPUs (CUDA version 12.8 or higher)?
I have the same problem. Could someone share a deployment recipe for the 5090?
Same problem here with transformers==4.56.2; it errors out with:
ValueError: You should provide exactly one of `layers` or `layer_class_to_replicate` to initialize a Cache.
Successful RTX 5090 Installation Guide (Step 6)

Hardware Specs:
🖥️ Ryzen 9 9950X3D
🧠 DDR5-5600 256GB
🎮 RTX 5090
🚀 Samsung 990PRO 2TB + 2TB
Final Environment Status:
```text
🎯 Environment Check Successful
PyTorch: 2.8.0+cu128
Transformers: 4.56.1
KTransformers: 0.3.2
CUDA available: True
GPU: NVIDIA GeForce RTX 5090
```

Step 6: KTransformers Installation (Critical Steps)

```bash
# Clone repository
cd ~
git clone https://github.com/KVCache-AI/KTransformers.git
cd KTransformers

# Submodule initialization (IMPORTANT!)
git submodule init
git submodule update

# Installation
make dev_install

# Verification
python -c "import ktransformers; print('Installation successful!')"
```

Key Success Factors:
✅ Submodule initialization is mandatory
✅ PyTorch 2.8+ with CUDA 12.8 compatibility
✅ make dev_install for complete setup
✅ RTX 5090 works perfectly with KTransformers
Hope this helps others struggling with RTX 50 series setup! 🚀
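The "Environment Check" output above can be reproduced with a short script. This is only a generic sketch, not a KTransformers utility: it reads `__version__` from whichever of the listed packages happen to be importable, so it works (with `None`/`unknown` placeholders) even on a machine where some of them are missing.

```python
# Sketch of an environment check like the one shown above.
# Reports the version of each package if it is importable;
# generic helper, not part of KTransformers itself.
from importlib import import_module


def env_report(packages=("torch", "transformers", "ktransformers")):
    """Map each package name to its __version__, 'unknown' if the
    module has no __version__ attribute, or None if not installed."""
    report = {}
    for name in packages:
        try:
            mod = import_module(name)
            report[name] = getattr(mod, "__version__", "unknown")
        except ImportError:
            report[name] = None  # not installed
    return report


if __name__ == "__main__":
    for name, version in env_report().items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

Running it on the box described above should echo the PyTorch/Transformers/KTransformers versions listed in the environment check.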
Can you run the Qwen3-235B model?
A 5080 also hits ValueError: You should provide exactly one of `layers` or `layer_class_to_replicate` to initialize a Cache, with transformers==4.57.0, ktransformers==0.3.2+cu128torch28avx2, CUDA 12.8, and torch 2.8.0. I tried a 4060 Ti as well and got the same error. Does anyone know how to fix this?
After a few days of digging, the fix turned out to be simple: at the git clone step, clone the tagged release instead of the default branch.

```bash
git clone --branch v0.3.2 https://github.com/kvcache-ai/ktransformers.git
```
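Combining this fix with the submodule and build steps from the installation guide earlier in the thread, the full sequence would look like the following. This is a sketch assembled from commands already reported here, not independently verified on every setup:

```shell
# Clone the tagged v0.3.2 release rather than the default branch
git clone --branch v0.3.2 https://github.com/kvcache-ai/ktransformers.git
cd ktransformers

# Initialize submodules (reported as mandatory above)
git submodule init
git submodule update

# Build and install
make dev_install

# Verify the install
python -c "import ktransformers; print('Installation successful!')"
```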
Can a usable Docker image be built from the Dockerfile in the project? Which parts of that file would need to be changed to support the 5090?