chatglm.cpp

chatglm-ggml_q4_0.bin GGML_ASSERT ggml-metal.m:1453: false

Open zwqjoy opened this issue 1 year ago • 7 comments

./build/bin/main -m ../GGUF_Models/chatglm-ggml_q4_0.bin -l 256 -p "你好"

GGML_ASSERT: /Users/apple/PycharmProjects/NLPProject/chatglm.cpp/third_party/ggml/src/ggml-metal.m:1453: false

zwqjoy avatar Nov 10 '23 01:11 zwqjoy

Is the environment that produces this error a Mac with Metal?

Weaxs avatar Dec 18 '23 03:12 Weaxs

I ran into the same problem. The log output is: ggml_metal_graph_compute: command buffer 0 failed with status 5

M1 Pro 16GB; running chatglm3-f16 with the mps backend triggered it.

The OS is Sonoma 14.2.
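
For reference, the "status 5" in that log line is the raw value of Metal's MTLCommandBufferStatus enum, where 5 means the command buffer ended in the error state. A minimal Python sketch of the mapping, with values taken from Apple's Metal headers as far as I know them; the helper name describe_status is made up for illustration:

```python
# Raw values of Metal's MTLCommandBufferStatus enum (per Apple's Metal headers).
MTL_COMMAND_BUFFER_STATUS = {
    0: "NotEnqueued",
    1: "Enqueued",
    2: "Committed",
    3: "Scheduled",
    4: "Completed",
    5: "Error",
}


def describe_status(code: int) -> str:
    """Hypothetical helper: turn the integer printed in the log into a readable name."""
    name = MTL_COMMAND_BUFFER_STATUS.get(code, "Unknown")
    return f"MTLCommandBufferStatus{name} ({code})"


if __name__ == "__main__":
    print(describe_status(5))
```

The status only says that the buffer failed. The reason has to be read from the command buffer's error property.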

XuYicong avatar Dec 19 '23 18:12 XuYicong

Could you run uname -spm and post the output? And if it's convenient, could you paste the cmake log as well?

Weaxs avatar Dec 20 '23 02:12 Weaxs

Darwin arm64 arm

cmake output in the terminal:

-- The CXX compiler identification is AppleClang 15.0.0.15000100
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
CMake Deprecation Warning at third_party/ggml/CMakeLists.txt:1 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- The C compiler identification is AppleClang 15.0.0.15000100
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- CMAKE_SYSTEM_PROCESSOR: arm64
-- ARM detected
-- Accelerate framework found
CMake Warning (dev) at third_party/ggml/src/CMakeLists.txt:322 (install):
  Target ggml has RESOURCE files but no RESOURCE DESTINATION.
This warning is for project developers.  Use -Wno-dev to suppress it.

CMake Deprecation Warning at third_party/sentencepiece/CMakeLists.txt:15 (cmake_minimum_required):
  Compatibility with CMake < 3.5 will be removed from a future version of
  CMake.

  Update the VERSION argument <min> value or use a ...<max> suffix to tell
  CMake that the project does not need compatibility with older versions.


-- VERSION: 0.2.00
-- Configuring done (1.1s)
-- Generating done (0.0s)
-- Build files have been written to: /Users/xyct/llm/chatglm.cpp/build

Also, this showed up among the compiler warnings; I don't know whether it's related:

/Users/xyct/llm/chatglm.cpp/third_party/ggml/src/ggml.c:11895:17: warning: 'cblas_sgemm' is deprecated: first deprecated in macOS 13.3 - An updated CBLAS interface supporting ILP64 is available.  Please compile with -DACCELERATE_NEW_LAPACK to access the new headers and -DACCELERATE_LAPACK_ILP64 for ILP64 support. [-Wdeprecated-declarations]
                cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasTrans,
                ^
/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX14.2.sdk/System/Library/Frameworks/vecLib.framework/Headers/cblas.h:610:6: note: 'cblas_sgemm' has been explicitly marked deprecated here
void cblas_sgemm(const enum CBLAS_ORDER __Order,
     ^

XuYicong avatar Dec 20 '23 02:12 XuYicong

That warning should be unrelated. Could you also post the command you ran and its execution log?

Weaxs avatar Dec 20 '23 14:12 Weaxs

What I mainly wanted to see in the cmake build log was the flags used when compiling ggml; what you posted doesn't include them.

Weaxs avatar Dec 20 '23 14:12 Weaxs

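@Weaxs I deleted the build directory and ran cmake again, then updated my post above with the more complete output, but it still doesn't seem to contain any ggml-related flags.

The command and log are the same as the original poster's, but I've already deleted the model, so I can't easily rerun it... disk space is precious.

But I think the cause is simply running out of memory. When I run qwen-14B-Q4_K with llama.cpp, everything works with a short context length, but setting it to around 1000 produces exactly the same error. When I ran chatglm3-f16 earlier, there was a wait of ten-odd seconds between pressing Enter and the GGML_ASSERT output, during which memory pressure climbed to the maximum and then dropped instantly, so it must have run out of memory.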
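
A back-of-envelope check is consistent with the out-of-memory theory. The Python sketch below uses assumed figures, not measurements: roughly 6.2e9 parameters for chatglm3-6b, 2 bytes per f16 weight, 16 GiB of unified memory, and about two thirds of that usable by the GPU. The function names are hypothetical.

```python
GIB = 1024 ** 3


def weights_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    return n_params * bytes_per_param / GIB


def fits_in_budget(n_params: float, bytes_per_param: float,
                   total_ram_gib: float, gpu_fraction: float) -> bool:
    """True if the weights alone fit inside the assumed GPU memory budget."""
    return weights_gib(n_params, bytes_per_param) <= total_ram_gib * gpu_fraction


if __name__ == "__main__":
    # Assumed numbers (see the text above); none of them were measured.
    print(f"{weights_gib(6.2e9, 2):.1f} GiB of f16 weights")
    print("fits:", fits_in_budget(6.2e9, 2, 16, 2 / 3))
```

Under these assumptions the f16 weights alone exceed the GPU budget, before counting the KV cache and scratch buffers. Those grow with context length, which would also match the llama.cpp behavior described above.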


Update:

By modifying the code to print [ctx->command_buffers[i] error], I got the following error message: Insufficient Memory (00000008:kIOGPUCommandBufferCallbackErrorOutOfMemory). So the cause is indeed insufficient memory.

How to modify: at the start of void ggml_metal_graph_compute, change ctx->command_buffers[i] = [ctx->queue commandBuffer]; to

        MTLCommandBufferDescriptor* descriptor = [[MTLCommandBufferDescriptor alloc] init];
        descriptor.errorOptions = MTLCommandBufferErrorOptionEncoderExecutionStatus;
        ctx->command_buffers[i] = [ctx->queue commandBufferWithDescriptor:descriptor];
        [descriptor release];

and insert the following before the line that raises the error:

            NSError *error = [ctx->command_buffers[i] error];
            if (error && ([ctx->command_buffers[i] errorOptions] &
                          MTLCommandBufferErrorOptionEncoderExecutionStatus)) {
                GGML_METAL_LOG_INFO("%s", error.localizedDescription.UTF8String);
            }

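This makes the error message visible. Note: chatglm.cpp does not set an output callback for GGML_METAL_LOG_INFO, so you may need to change it to printf to get any output.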
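
To illustrate why nothing is printed without a callback, here is a small Python sketch of a callback-gated logger. It assumes the log macro only forwards messages to a registered callback and silently drops them otherwise. All names (set_log_callback, log_info) are hypothetical stand-ins, not the ggml API.

```python
from typing import Callable, Optional

# Hypothetical stand-in for a callback-gated logger; not the real ggml code.
_state: dict = {"callback": None}


def set_log_callback(callback: Optional[Callable[[str], None]]) -> None:
    """Register (or clear, with None) the function that receives log messages."""
    _state["callback"] = callback


def log_info(message: str) -> bool:
    """Forward the message if a callback is registered; otherwise drop it.

    Returns True when the message was delivered.
    """
    callback = _state["callback"]
    if callback is None:
        return False
    callback(message)
    return True


if __name__ == "__main__":
    log_info("dropped: no callback registered yet")
    set_log_callback(print)
    log_info("Insufficient Memory")  # delivered to print
```

With such a design, either the host program registers a callback or the message is written directly with printf, which is the workaround suggested above.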

XuYicong avatar Dec 20 '23 17:12 XuYicong