chatglm.cpp
chatglm-ggml_q4_0.bin GGML_ASSERT ggml-metal.m:1453: false
./build/bin/main -m ../GGUF_Models/chatglm-ggml_q4_0.bin -l 256 -p "你好"
GGML_ASSERT: /Users/apple/PycharmProjects/NLPProject/chatglm.cpp/third_party/ggml/src/ggml-metal.m:1453: false
Is the environment where this error occurs Mac + Metal?
I'm running into the same problem.
The output log is as follows:
ggml_metal_graph_compute: command buffer 0 failed with status 5
M1 Pro, 16 GB; running chatglm3-f16 on the MPS backend triggers this.
The system is Sonoma 14.2.
Please run uname -spm and post the output. Also, could you paste the CMake log?
Darwin arm64 arm
CMake output in the terminal:
-- The CXX compiler identification is AppleClang 15.0.0.15000100
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Deprecation Warning at third_party/ggml/CMakeLists.txt:1 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- The C compiler identification is AppleClang 15.0.0.15000100
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: arm64
-- ARM detected
-- Accelerate framework found
CMake Warning (dev) at third_party/ggml/src/CMakeLists.txt:322 (install):
Target ggml has RESOURCE files but no RESOURCE DESTINATION.
This warning is for project developers. Use -Wno-dev to suppress it.
CMake Deprecation Warning at third_party/sentencepiece/CMakeLists.txt:15 (cmake_minimum_required):
Compatibility with CMake < 3.5 will be removed from a future version of
CMake.
Update the VERSION argument <min> value or use a ...<max> suffix to tell
CMake that the project does not need compatibility with older versions.
-- VERSION: 0.2.00
-- Configuring done (1.1s)
-- Generating done (0.0s)
-- Build files have been written to: /Users/xyct/llm/chatglm.cpp/build
By the way, there is this compiler warning in the build output; I'm not sure whether it's related:
/Users/xyct/llm/chatglm.cpp/third_party/ggml/src/ggml.c:11895:17: warning: 'cblas_sgemm' is deprecated: first deprecated in macOS 13.3 - An updated CBLAS interface supporting ILP64 is available. Please compile with -DACCELERATE_NEW_LAPACK to access the new headers and -DACCELERATE_LAPACK_ILP64 for ILP64 support. [-Wdeprecated-declarations]
cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks/vecLib.framework/Headers/cblas.h:610:6: note: 'cblas_sgemm' has been explicitly marked deprecated here
void cblas_sgemm(const enum CBLAS_ORDER __Order,
^
That warning shouldn't matter. Could you post the command you ran and the runtime log again?
What I mainly wanted from the CMake log is the parameters used when building ggml, and they aren't in what you posted.
@Weaxs I deleted the build directory and re-ran cmake, and updated the post above with the more complete output, but it still doesn't seem to include the ggml-related parameters.
The command and log are the same as the OP's, but I've already deleted the model, so it's inconvenient to run it again... disk space is precious.
That said, I think the cause is simply running out of memory. Running qwen-14B-Q4_K with llama.cpp, everything works when the context length is short, but setting it to around 1000 produces exactly the same error. Earlier, when running chatglm3-f16, there was a wait of ten-odd seconds between pressing Enter and the GGML_ASSERT output, during which memory pressure climbed to the top and then dropped instantly, so it was definitely running out of memory.
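As a side check, one quick way to sanity-check the out-of-memory theory is to compare the Metal device's recommended working-set budget with what the process has actually allocated. This is only a diagnostic sketch, not part of chatglm.cpp; it assumes you can call it with the device already held by ggml-metal.m (e.g. ctx->device):

#import <Metal/Metal.h>
#include <stdio.h>

// Hypothetical diagnostic helper: prints how close the Metal device is to
// its recommended working-set limit.
static void print_metal_memory_budget(id<MTLDevice> device) {
    // recommendedMaxWorkingSetSize is the budget Metal suggests staying under;
    // currentAllocatedSize is what this process has already allocated on the GPU.
    fprintf(stderr, "metal working-set budget: %.2f GiB, currently allocated: %.2f GiB\n",
            (double) device.recommendedMaxWorkingSetSize / (1024.0 * 1024.0 * 1024.0),
            (double) device.currentAllocatedSize / (1024.0 * 1024.0 * 1024.0));
}

If the allocated number approaches the budget right before the assert fires, that lines up with the out-of-memory error reported below.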
Update:
By modifying the code to print [ctx->command_buffers[i] error], I obtained the following error message:
Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory)
So the cause really is insufficient memory.
How to modify: near the top of ggml_metal_graph_compute, change
ctx->command_buffers[i] = [ctx->queue commandBuffer];
to
MTLCommandBufferDescriptor* descriptor = [[MTLCommandBufferDescriptor alloc] init];
descriptor.errorOptions = MTLCommandBufferErrorOptionEncoderExecutionStatus;
ctx->command_buffers[i] = [ctx->queue commandBufferWithDescriptor:descriptor];
[descriptor release];
and insert the following just before the line that triggers the assert:
NSError *error = [ctx->command_buffers[i] error];
if (error && ([ctx->command_buffers[i] errorOptions] &
              MTLCommandBufferErrorOptionEncoderExecutionStatus)) {
    GGML_METAL_LOG_INFO("%s", error.localizedDescription.UTF8String);
}
and the error message will be printed.
Note: chatglm.cpp does not register an output callback for GGML_METAL_LOG_INFO, so you may need to replace it with printf to see any output.
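For reference, newer ggml revisions declare ggml_metal_log_set_callback in ggml-metal.h for routing these messages; whether the copy under third_party/ggml already has it is an assumption here. If it does, registering a callback avoids patching the macro. A minimal sketch:

#include <stdio.h>
#include "ggml-metal.h"

// Assumes the bundled ggml exposes ggml_metal_log_set_callback (newer ggml
// trees do); older copies may still need the printf replacement described above.
static void metal_log_to_stderr(enum ggml_log_level level, const char * text, void * user_data) {
    (void) level;
    (void) user_data;
    fputs(text, stderr);  // forward every Metal backend log message to stderr
}

// Register it once, before the Metal context is created:
// ggml_metal_log_set_callback(metal_log_to_stderr, NULL);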