[Bug] CUDA version dependency issue with the whl package
Checklist
- [x] 1. I have searched related issues but cannot get the expected help.
- [x] 2. The bug has not been fixed in the latest version.
- [ ] 3. Please note that if the bug-related issue you submitted lacks corresponding environment info and a minimal reproducible demo, it will be challenging for us to reproduce and resolve the issue, reducing the likelihood of receiving feedback.
- [ ] 4. If the issue you raised is not a bug but a question, please raise a discussion at https://github.com/kvcache-ai/ktransformers/discussions. Otherwise, it will be closed.
- [ ] 5. To help the community, I will use Chinese/English or attach a Chinese/English translation if using another language. Non-Chinese/English content without translation may be closed.
Describe the bug
I installed ktransformers and LLaMA-Factory following the guide at https://github.com/kvcache-ai/ktransformers/blob/main/KT-SFT/README.md#quick-to-start. After installation, I started fine-tuning by running USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml, which failed with the error below:
```
no balance_serve
flashinfer not found, use triton for linux
Traceback (most recent call last):
  File "/root/miniconda3/envs/kllama/bin/llamafactory-cli", line 7, in <module>
    sys.exit(main())
             ^^^^^^
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/cli.py", line 24, in main
    launcher.launch()
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/launcher.py", line 155, in launch
    from .train.tuner import run_exp
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/train/tuner.py", line 29, in <module>
    from ..model import load_model, load_tokenizer
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/model/__init__.py", line 15, in <module>
    from .loader import load_config, load_model, load_tokenizer
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/model/loader.py", line 33, in <module>
    from .adapter import init_adapter
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/model/adapter.py", line 24, in <module>
    from .model_utils.ktransformers import get_kt_peft_model, load_kt_peft_model
  File "/home/ictrek/xuejin/LLaMA-Factory/src/llamafactory/model/model_utils/ktransformers.py", line 39, in <module>
    from ktransformers.sft.lora import inject_lora_layer
  File "/root/miniconda3/envs/kllama/lib/python3.12/site-packages/ktransformers/sft/lora.py", line 46, in <module>
    from ktransformers.sft.peft_utils.mapping import get_peft_model
  File "/root/miniconda3/envs/kllama/lib/python3.12/site-packages/ktransformers/sft/peft_utils/mapping.py", line 8, in <module>
    from ktransformers.sft.peft_utils.lora_model import LoraModel
  File "/root/miniconda3/envs/kllama/lib/python3.12/site-packages/ktransformers/sft/peft_utils/lora_model.py", line 36, in <module>
    from ktransformers.sft.peft_utils.lora_layer import dispatch_default, LoraLayer
  File "/root/miniconda3/envs/kllama/lib/python3.12/site-packages/ktransformers/sft/peft_utils/lora_layer.py", line 16, in <module>
    from ktransformers.operators.linear import KTransformersLinear, KLinearTorch, KLinearBase
  File "/root/miniconda3/envs/kllama/lib/python3.12/site-packages/ktransformers/operators/linear.py", line 41, in <module>
    import cpuinfer_ext
ImportError: libcudart.so.11.0: cannot open shared object file: No such file or directory
```
Is this caused by the ktransformers-0.4.1+cu128torch28fancy-cp312-cp312-linux_x86_64.whl package depending on CUDA 11.0? Do I need to build ktransformers from source locally? My local CUDA version is 13.0.
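One way to check where the CUDA 11 dependency comes from, without importing the module (the import itself is what raises the error), is to inspect the compiled extension with ldd. A minimal diagnostic sketch, assuming the conda env path from the traceback above:

```bash
# Locate the prebuilt extension inside the env and list its CUDA runtime
# dependency; a line like "libcudart.so.11.0 => not found" confirms that
# the wheel's cpuinfer_ext was linked against the CUDA 11 runtime.
find /root/miniconda3/envs/kllama/lib/python3.12/site-packages \
    -name 'cpuinfer_ext*.so' -exec ldd {} \; | grep cudart
```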
Reproduction
USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
Environment
OS: Linux x86_64
CUDA (local): 13.0
Did you run this step (regardless of which CUDA version you have locally): conda install -y -c nvidia/label/cuda-11.8.0 cuda-runtime
It was indeed because that step was missing; it works now.
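For anyone hitting the same error: the prebuilt wheel's cpuinfer_ext links against libcudart.so.11.0, so the CUDA 11.8 runtime must be present inside the env even when the system CUDA is newer (13.0 here). A minimal recovery sketch, assuming the env name kllama from the traceback:

```bash
# Install the CUDA 11.8 runtime into the conda env; CUDA 11.x ships
# libcudart with the soname libcudart.so.11.0 that the extension expects.
conda activate kllama
conda install -y -c nvidia/label/cuda-11.8.0 cuda-runtime

# Re-run the fine-tuning command from the reproduction step.
USE_KT=1 llamafactory-cli train examples/train_lora/deepseek3_lora_sft_kt.yaml
```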