
Error with version 0.3

Open txg1550759 opened this issue 10 months ago • 19 comments

Environment: CUDA 12.6, Python 3.11, pytorch/pytorch:2.6.0-cuda12.6-cudnn9-devel

```
root@b717ad53c9c5:/workspace/ktransformers# strings /usr/local/gcc-13.2.0/lib64/libstdc++.so.6 | grep '^GLIBCXX_3.4.32'
GLIBCXX_3.4.32
GLIBCXX_3.4.32
root@b717ad53c9c5:/workspace/ktransformers# pip list |grep ktran
ktransformers 0.3.0rc0+cu126torch26fancy
```

The official announcement says version 0.3 improves performance, so I tried to build a 0.3 environment.

As shown below, all the environment requirements are met.

It ultimately got stuck on this error: ImportError: /opt/conda/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKSs

An AI analysis suggests the PyTorch version is the problem?
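One way to check that theory (a minimal diagnostic sketch, assuming binutils is installed; paths match the conda layout in the traceback): demangle the missing symbol and see whether the installed libtorch actually exports it. The trailing RKSs in the mangled name denotes a reference to the pre-C++11 std::string, which would point to an ABI or version mismatch between the prebuilt op and the installed PyTorch.

```bash
# Demangle the missing symbol to see which C++ signature the op expects
echo '_ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKSs' | c++filt

# Confirm the PyTorch version actually installed in this environment
python -c "import torch; print(torch.__version__, torch.version.cuda)"

# Search torch's shared libraries for the symbol (c10::detail::* normally
# lives in libc10.so); no hit means the wheel expects a different torch ABI
for so in /opt/conda/lib/python3.11/site-packages/torch/lib/*.so; do
    nm -D "$so" 2>/dev/null | grep -q torchInternalAssertFail && echo "found in $so"
done
```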

Has anyone managed to test version 0.3 successfully?

The error I got:

```
root@b717ad53c9c5:/workspace/ktransformers# python -m ktransformers.local_chat --gguf_path "/models" --model_path "/models" --cpu_infer 72 --max_new_tokens 20000 --optimize_rule_path ./ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/workspace/ktransformers/ktransformers/local_chat.py", line 25, in <module>
    from ktransformers.optimize.optimize import optimize_and_load_gguf
  File "/workspace/ktransformers/ktransformers/optimize/optimize.py", line 15, in <module>
    from ktransformers.util.custom_gguf import GGUFLoader, translate_name_to_gguf
  File "/workspace/ktransformers/ktransformers/util/custom_gguf.py", line 27, in <module>
    import KTransformersOps
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.32' not found (required by /opt/conda/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so)
```

After upgrading GCC so that GLIBCXX_3.4.32 is available, the GLIBCXX_3.4.32 error went away:

```bash
#!/bin/bash

# Step 1: install build dependencies
echo "Installing build dependencies..."
apt-get update
apt-get install -y build-essential wget flex bison gawk

# Step 2: download the GCC source
echo "Downloading the GCC 13.2.0 source..."
wget https://ftp.gnu.org/gnu/gcc/gcc-13.2.0/gcc-13.2.0.tar.gz

# Step 3: extract the source archive
echo "Extracting the GCC 13.2.0 source archive..."
tar -xzvf gcc-13.2.0.tar.gz
cd gcc-13.2.0

# Step 4: download GCC's prerequisite libraries
echo "Downloading GCC prerequisites..."
./contrib/download_prerequisites

# Step 5: create and enter the build directory
echo "Creating and entering the build directory..."
mkdir build
cd build

# Step 6: configure the build
echo "Configuring the build..."
../configure --enable-languages=c,c++ --disable-multilib --prefix=/usr/local/gcc-13.2.0

# Step 7: build GCC
echo "Building GCC; this can take a while..."
make -j$(nproc)

# Step 8: install GCC
echo "Installing GCC..."
make install

# Step 9: update environment variables
echo "Updating environment variables..."
echo 'export PATH=/usr/local/gcc-13.2.0/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/gcc-13.2.0/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc

# Step 10: verify the installation
echo "Verifying the installation..."
strings /usr/local/gcc-13.2.0/lib64/libstdc++.so.6 | grep '^GLIBCXX_3.4.32'

echo "GCC 13.2.0 installation complete!"
```

It then errored again:

```
root@b717ad53c9c5:/workspace/ktransformers# python -m ktransformers.local_chat --gguf_path "/models" --model_path "/models" --cpu_infer 72 --max_new_tokens 20000 --optimize_rule_path ./ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
Traceback (most recent call last):
  File "<frozen runpy>", line 198, in _run_module_as_main
  File "<frozen runpy>", line 88, in _run_code
  File "/workspace/ktransformers/ktransformers/local_chat.py", line 25, in <module>
    from ktransformers.optimize.optimize import optimize_and_load_gguf
  File "/workspace/ktransformers/ktransformers/optimize/optimize.py", line 15, in <module>
    from ktransformers.util.custom_gguf import GGUFLoader, translate_name_to_gguf
  File "/workspace/ktransformers/ktransformers/util/custom_gguf.py", line 27, in <module>
    import KTransformersOps
ImportError: /opt/conda/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail23torchInternalAssertFailEPKcS2_jS2_RKSs
```

txg1550759 avatar Feb 22 '25 05:02 txg1550759

Some prebuilt binaries you can try are provided here, along with a detailed, simple build guide: https://github.com/ubergarm/r1-ktransformers-guide/blob/main/README.zh.md

ubergarm avatar Feb 22 '25 06:02 ubergarm

> Some prebuilt binaries you can try are provided here, along with a detailed, simple build guide: https://github.com/ubergarm/r1-ktransformers-guide/blob/main/README.zh.md

Thank you. I see the link you provided is for 0.2.1, which I already had running before. What I need is the build documentation for version 0.3, including its Dockerfile. Could you help with that?

txg1550759 avatar Feb 22 '25 06:02 txg1550759

I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:

  1. Download the main-branch source package, extract it, and enter the top-level directory.
  2. cd ktransformers/ktransformers/ktransformers_ext/cuda/
  3. python setup.py develop
  4. ln -sf ktransformers/ktransformers/ktransformers_ext/cuda/KTransformersOps.cpython-311-x86_64-linux-gnu.so $YOURCONDAPATH/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so (see the note on absolute paths below)
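A caveat on step 4: ln -sf stores the target string verbatim, so the relative path above yields a dangling link unless it happens to resolve from inside site-packages. A safer variant of the same steps (a sketch; $YOURCONDAPATH as above):

```bash
cd ktransformers/ktransformers/ktransformers_ext/cuda/
python setup.py develop

# Use an absolute target so the link resolves regardless of the caller's cwd
ln -sf "$(pwd)/KTransformersOps.cpython-311-x86_64-linux-gnu.so" \
    "$YOURCONDAPATH/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so"
```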

zc129 avatar Feb 22 '25 12:02 zc129

> I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:
>
>   1. Download the main-branch source package, extract it, and enter the top-level directory.
>   2. cd ktransformers/ktransformers/ktransformers_ext/cuda/
>   3. python setup.py develop
>   4. ln -sf ktransformers/ktransformers/ktransformers_ext/cuda/KTransformersOps.cpython-311-x86_64-linux-gnu.so $YOURCONDAPATH/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so

Thank you, it runs now and loads the model.

But then it fails again with what looks like a CPU AMX instruction problem? My CPU is an Intel(R) Xeon(R) Gold 5320, which only supports AVX, not AMX. What can I do? Can a CPU without AMX run 0.3 at all?

The CPU does appear to support AVX512F:

```
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 invpcid_single intel_pt ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq md_clear pconfig spec_ctrl intel_stibp flush_l1d arch_capabilities
```
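For quick reference, one way to list both flag families at once (a small sketch; AMX-capable CPUs expose amx_tile, amx_int8, and amx_bf16 in /proc/cpuinfo, none of which appear above):

```bash
# List the AMX and AVX-512 feature flags the kernel reports for CPU 0
grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | grep -E '^(amx|avx512)' | sort -u
```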

The error is as follows:

```
root@a9046bcba61e:/workspace/ktransformers# python -m ktransformers.local_chat --gguf_path "/models" --model_path "/models" --cpu_infer 56 --max_new_tokens 20000 --optimize_rule_path ./ktransformers/optimize/optimize_rule
flashinfer not found, use triton for linux
using custom modeling_xxx.py.
AVX512F
Injecting model as ktransformers.operators.models . KDeepseekV2Model
Injecting model.embed_tokens as default
Injecting model.layers as default
Injecting model.layers.0 as default
Injecting model.layers.0.self_attn as ktransformers.operators.attention . KDeepseekV2Attention
Illegal instruction (core dumped)
```

txg1550759 avatar Feb 22 '25 13:02 txg1550759

> > I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:
> >
> >   1. Download the main-branch source package, extract it, and enter the top-level directory.
> >   2. cd ktransformers/ktransformers/ktransformers_ext/cuda/
> >   3. python setup.py develop
> >   4. ln -sf ktransformers/ktransformers/ktransformers_ext/cuda/KTransformersOps.cpython-311-x86_64-linux-gnu.so $YOURCONDAPATH/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so
>
> Thank you, it runs now and loads the model.
>
> But then it fails again with what looks like a CPU AMX instruction problem? My CPU is an Intel(R) Xeon(R) Gold 5320, which only supports AVX, not AMX. What can I do? Can a CPU without AMX run 0.3 at all?
>
> The CPU does appear to support AVX512F: flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc art arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch epb cat_l3 invpcid_single intel_pt ssbd mba ibrs ibpb stibp ibrs_enhanced tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a avx512f avx512dq rdseed adx smap avx512ifma clflushopt clwb avx512cd sha_ni avx512bw avx512vl xsaveopt xsavec xgetbv1 cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm ida arat pln pts avx512vbmi umip pku ospke avx512_vbmi2 gfni vaes vpclmulqdq avx512_vnni avx512_bitalg avx512_vpopcntdq md_clear pconfig spec_ctrl intel_stibp flush_l1d arch_capabilities
>
> The error is as follows:
>
> root@a9046bcba61e:/workspace/ktransformers# python -m ktransformers.local_chat --gguf_path "/models" --model_path "/models" --cpu_infer 56 --max_new_tokens 20000 --optimize_rule_path ./ktransformers/optimize/optimize_rule
> flashinfer not found, use triton for linux
> using custom modeling_xxx.py.
> AVX512F
> Injecting model as ktransformers.operators.models . KDeepseekV2Model
> Injecting model.embed_tokens as default
> Injecting model.layers as default
> Injecting model.layers.0 as default
> Injecting model.layers.0.self_attn as ktransformers.operators.attention . KDeepseekV2Attention
> Illegal instruction (core dumped)

My CPU doesn't support it either. You can check with lscpu | grep -i amx, or look it up on Intel's website. The 5320 does not appear to support AMX.

zc129 avatar Feb 22 '25 15:02 zc129

"AMX (Advanced Matrix Extensions) was introduced by Intel in June 2020 and first supported in the Sapphire Rapids microarchitecture for Xeon servers, which officially launched in January 2023."

Please see my guide for the exact CPU feature flags to look for.

ubergarm avatar Feb 22 '25 16:02 ubergarm

@txg1550759

> Thank you. I see the link you provided is for 0.2.1, which I already had running before. What I need is the build documentation for version 0.3, including its Dockerfile. Could you help with that?

No, the .whl I provide is newer than 0.2.1, but there are no good version numbers for this project because of challenges with the Python packaging setup. I built it from ktransformers@25c5bdd, which is newer than 0.2.1 (ktransformers@65d73ea). You can check your exact git version with git rev-parse --short HEAD; ignore the version number, it carries no information.

As far as I know, v0.3 does not correspond to any public git hash; it was apparently built manually on a machine with a special Intel compiler to enable AMX support. So if you install that 0.3 wheel, you are actually running older code with known bugs, but with AMX support.

I do not know how to enable AMX support when compiling myself without the special Intel build environment.

@zc129

> I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:

This might fix the compile issue, but it is not v0.3, is that correct? Do you have the special Intel compiler to enable AMX support for the latest Intel Xeon processors?

Thanks all for sharing in this exciting time of discovery together!

ubergarm avatar Feb 22 '25 16:02 ubergarm

@ubergarm My 0.3 build succeeded, but something feels off: the code in the repository is 0.2.1, so what I built is effectively crippled, though it may still be usable. As for the official wheel, 'fancy' looks like a custom build tag; it really ought to name the instruction sets, e.g. amx and avx512.

So there is no way to tell which CPU instructions the package ktransformers-0.3.0rc0+cu126torch26fancy-cp311-cp311-linux_x86_64.whl was built for.

That would explain why the Intel(R) Xeon(R) Gold 5320, a 3rd-generation Xeon, cannot run it. If I swapped in a 4th- or 5th-generation Xeon that supports AMX, it might work.

Also, from what everyone has said, this is the code that enables AMX:

```c
// preload_amx.c
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

#define ARCH_REQ_XCOMP_PERM 0x1023
#define XFEATURE_XTILEDATA  18

__attribute__((constructor)) void init() {
    // Ask the kernel for permission to use AMX tile data (XTILEDATA)
    if (syscall(SYS_arch_prctl, ARCH_REQ_XCOMP_PERM, XFEATURE_XTILEDATA)) {
        printf("\n Fail to do XFEATURE_XTILEDATA \n\n");
    } else {
        printf("\n TILE DATA USE SET - OK \n\n");
    }
}
```

```
root@a9046bcba61e:/workspace/ktransformers# gcc -shared -fPIC -o preload_amx.so preload_amx.c

root@a9046bcba61e:/workspace/ktransformers# LD_PRELOAD=./preload_amx.so python -m ktransformers.local_chat --gguf_path "/models" --model_path "/models" --cpu_infer 56 --max_new_tokens 20000 --optimize_rule_path ./ktransformers/optimize/optimize_rules/DeepSeek-V3-Chat-multi-gpu.yaml
```
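(A quick way to sanity-check the preload on its own before launching the full model; a minimal sketch, assuming a kernel new enough for AMX, i.e. 5.16 or later, and any trivial host process:)

```bash
# Should print "TILE DATA USE SET - OK" on an AMX-capable CPU and kernel
LD_PRELOAD=./preload_amx.so python -c "print('preload loaded')"
```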

I simply prepared the environment below to match the wheel's name:

CUDA 12.6, Python 3.11, pytorch/pytorch:2.6.0-cuda12.6-cudnn9-devel

```
root@b717ad53c9c5:/workspace/ktransformers# strings /usr/local/gcc-13.2.0/lib64/libstdc++.so.6 | grep '^GLIBCXX_3.4.32'
GLIBCXX_3.4.32
GLIBCXX_3.4.32
root@b717ad53c9c5:/workspace/ktransformers# pip list |grep ktran
ktransformers 0.3.0rc0+cu126torch26fancy
```

txg1550759 avatar Feb 23 '25 07:02 txg1550759

@txg1550759

> That would explain why the Intel(R) Xeon(R) Gold 5320, a 3rd-generation Xeon, cannot run it. If I swapped in a 4th- or 5th-generation Xeon that supports AMX, it might work.

Yes, that sounds right to me. The Intel Xeon Gold 5320 does not support AMX from what I can tell.

> AMX was introduced by Intel in June 2020 and first supported by Intel with the Sapphire Rapids microarchitecture for Xeon servers, released in January 2023. -wikipedia

Also, thanks for the extra information about LD_PRELOAD=./preload_amx.so. If I get access to a new Intel Xeon that supports AMX, maybe I can try this technique without waiting for an updated "v0.3" special release or the special Intel compiler stuff?

ubergarm avatar Feb 23 '25 15:02 ubergarm

> @txg1550759
>
> > Thank you. I see the link you provided is for 0.2.1, which I already had running before. What I need is the build documentation for version 0.3, including its Dockerfile. Could you help with that?
>
> No, the .whl I provide is newer than 0.2.1, but there are no good version numbers for this project because of challenges with the Python packaging setup. I built it from ktransformers@25c5bdd, which is newer than 0.2.1 (ktransformers@65d73ea). You can check your exact git version with git rev-parse --short HEAD; ignore the version number, it carries no information.
>
> As far as I know, v0.3 does not correspond to any public git hash; it was apparently built manually on a machine with a special Intel compiler to enable AMX support. So if you install that 0.3 wheel, you are actually running older code with known bugs, but with AMX support.
>
> I do not know how to enable AMX support when compiling myself without the special Intel build environment.
>
> @zc129
>
> > I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:
>
> This might fix the compile issue, but it is not v0.3, is that correct? Do you have the special Intel compiler to enable AMX support for the latest Intel Xeon processors?
>
> Thanks all for sharing in this exciting time of discovery together!

Yes, my CPU doesn't support AMX. My method only works around the KTransformersOps problem so the program can run; I don't know what the difference is between the KTransformersOps in v0.2.1 and the one in 0.3.

zc129 avatar Feb 24 '25 01:02 zc129

Instructions for building and one-click running the version 0.3 Docker image are freshly out. Testing is welcome: https://github.com/txg1550759/ktransformers-v0.3-docker.git
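(A minimal usage sketch, assuming the repo ships a top-level Dockerfile; the image tag and mount path below are hypothetical, so adjust them to whatever the repo's README specifies:)

```bash
git clone https://github.com/txg1550759/ktransformers-v0.3-docker.git
cd ktransformers-v0.3-docker

# Hypothetical tag and model mount; see the repo README for the real ones
docker build -t ktransformers:v0.3 .
docker run --gpus all -v /models:/models -it ktransformers:v0.3 bash
```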

txg1550759 avatar Feb 24 '25 02:02 txg1550759

@ubergarm

> Yes, that sounds right to me. The Intel Xeon Gold 5320 does not support AMX from what I can tell.
>
> > AMX was introduced by Intel in June 2020 and first supported by Intel with the Sapphire Rapids microarchitecture for Xeon servers, released in January 2023. -wikipedia
>
> Also, thanks for the extra information about LD_PRELOAD=./preload_amx.so. If I get access to a new Intel Xeon that supports AMX, maybe I can try this technique without waiting for an updated "v0.3" special release or the special Intel compiler stuff?

The v0.3 binary is compiled with AMX instructions. Consequently, for v0.3 to run properly, your CPU must support AMX. Currently, Xeon processors of the 4th generation and later (such as SPR, EMR, and GNR) are equipped with AMX support. However, there is a bug in the v0.3 binary that causes the "Illegal instruction" error. You can follow the solution below to address this issue: https://github.com/kvcache-ai/ktransformers/issues/320

yuliao0214 avatar Feb 24 '25 04:02 yuliao0214

Which branch is 0.3? I can't seem to find it...

ymodo avatar Feb 27 '25 09:02 ymodo

@ymodo

> Which branch is 0.3? I can't seem to find it...

It does not exist on the public GitHub; it is likely an internal branch or tag, given that the binary .whl was attached to an older release page. As far as I know, you cannot reproduce it directly yourself, as it likely requires some special Intel libraries or compilers.

I just did a round of benchmarking on a dual-socket Intel Xeon 6980P server using llama.cpp, fully on CPU with no GPU. I got some tips on how to compile it better, which I want to try: https://github.com/ggml-org/llama.cpp/discussions/12088

Things are moving fast, and it may soon be possible to run the original R1 fp8 model using AMX extensions and a 4090+ GPU that supports fp8.

Good luck!

ubergarm avatar Feb 27 '25 16:02 ubergarm

> I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:
>
>   1. Download the main-branch source package, extract it, and enter the top-level directory.
>   2. cd ktransformers/ktransformers/ktransformers_ext/cuda/
>   3. python setup.py develop
>   4. ln -sf ktransformers/ktransformers/ktransformers_ext/cuda/KTransformersOps.cpython-311-x86_64-linux-gnu.so $YOURCONDAPATH/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so

It works, thank you!

xiaohangguo avatar Mar 12 '25 07:03 xiaohangguo

> I recompiled KTransformersOps and replaced the original one, which fixed the problem. You can try:
>
>   1. Download the main-branch source package, extract it, and enter the top-level directory.
>   2. cd ktransformers/ktransformers/ktransformers_ext/cuda/
>   3. python setup.py develop
>   4. ln -sf ktransformers/ktransformers/ktransformers_ext/cuda/KTransformersOps.cpython-311-x86_64-linux-gnu.so $YOURCONDAPATH/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so

This solved the problem; it worked.

Xeon 8581C, about 10% faster.

lhtpluto avatar Mar 13 '25 08:03 lhtpluto

FOR AMD, BUILD SUCCESS. But running:

```
python ./ktransformers/local_chat.py \
    --model_path /root/DeepSeek-R1 \
    --gguf_path /root/DeepSeek-R1-UD-IQ1_S \
    --max_new_tokens 2048 \
    --force_think true
```

fails with:

```
Traceback (most recent call last):
  File "/root/ktransformers/./ktransformers/local_chat.py", line 25, in <module>
    from ktransformers.optimize.optimize import optimize_and_load_gguf
  File "/root/ktransformers/./ktransformers/optimize/optimize.py", line 15, in <module>
    from ktransformers.util.custom_gguf import GGUFLoader, translate_name_to_gguf
  File "/root/ktransformers/./ktransformers/util/custom_gguf.py", line 27, in <module>
    import KTransformersOps
ImportError: /root/anaconda3/envs/ktransformers/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so: undefined symbol: _Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_llllb
```
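(For diagnosis, a small sketch assuming binutils is available: confirm the extension references the symbol without defining it, and demangle it to see the expected signature. An undefined gptq_marlin_gemm would suggest the CUDA Marlin GPTQ kernels were not compiled into this build.)

```bash
# 'U' in the output means the .so references the symbol but nothing defines it
nm -D /root/anaconda3/envs/ktransformers/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so | grep gptq_marlin_gemm

# Demangle to see the C++ signature the loader is looking for
echo '_Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_llllb' | c++filt
```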

mj520 avatar May 01 '25 05:05 mj520

> FOR AMD, BUILD SUCCESS. But running python ./ktransformers/local_chat.py --model_path /root/DeepSeek-R1 --gguf_path /root/DeepSeek-R1-UD-IQ1_S --max_new_tokens 2048 --force_think true fails with:
>
> ImportError: /root/anaconda3/envs/ktransformers/lib/python3.11/site-packages/KTransformersOps.cpython-311-x86_64-linux-gnu.so: undefined symbol: _Z16gptq_marlin_gemmRN2at6TensorES1_S1_S1_S1_S1_llllb

same error

tsdcz avatar May 10 '25 12:05 tsdcz

Why doesn't the main branch have the path ktransformers/ktransformers/ktransformers_ext/cuda/?

udun01 avatar Aug 28 '25 07:08 udun01