
[CANN] Simplify the environment variable setting for GGML_CANN_MEM_POOL and GGML_CANN_ASYNC_MODE

Open bachelor-dou opened this issue 7 months ago • 1 comments

This PR simplifies the environment variable setup for GGML_CANN_MEM_POOL and GGML_CANN_ASYNC_MODE.

GGML_CANN_MEM_POOL:

  • Defaults to the VMM pool when the variable is not set.
  • Setting export GGML_CANN_MEM_POOL="pRio" (the value is case-insensitive) selects the priority queue-based memory pool.
  • Setting export GGML_CANN_MEM_POOL="legacy", or any other value, selects the legacy pool.
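The selection rule in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `PoolKind`, `to_lower`, and `select_pool` are hypothetical names, and treating an explicit "vmm" value as the VMM pool is an assumption that the description above does not state.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical names for illustration; not taken from the CANN backend sources.
enum class PoolKind { Vmm, Prio, Legacy };

// Lower-case a copy of the input so comparisons are case-insensitive.
static std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

// `raw` is what std::getenv("GGML_CANN_MEM_POOL") would return;
// nullptr means the variable is not set.
PoolKind select_pool(const char * raw) {
    if (raw == nullptr) {
        return PoolKind::Vmm;      // default when unset
    }
    const std::string v = to_lower(raw);
    if (v == "prio") {
        return PoolKind::Prio;     // "pRio", "PRIO", ... all match
    }
    if (v == "vmm") {
        return PoolKind::Vmm;      // assumption: explicit "vmm" also selects VMM
    }
    return PoolKind::Legacy;       // "legacy" or any other value
}
```

A caller would pass the result of `std::getenv("GGML_CANN_MEM_POOL")` (from `<cstdlib>`) straight into `select_pool`. Keeping the parsing separate from the environment lookup makes the rule easy to check in isolation.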

GGML_CANN_ASYNC_MODE:

  • yes, enable, y, 1, and on (case-insensitive) are all valid values for enabling GGML_CANN_ASYNC_MODE, for example export GGML_CANN_ASYNC_MODE=yEs.
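One way to read the accepted values is as a case-insensitive match against a small allow-list. The sketch below assumes that any other value, and an unset variable, leave async mode disabled. `async_mode_enabled` is a hypothetical helper name, not a function from the backend.

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Hypothetical helper; `raw` is what std::getenv("GGML_CANN_ASYNC_MODE")
// would return, with nullptr meaning the variable is not set.
bool async_mode_enabled(const char * raw) {
    if (raw == nullptr) {
        return false;  // assumption: unset means disabled
    }
    std::string v(raw);
    // Normalize to lower case so "yEs", "ON", "Enable" all match.
    std::transform(v.begin(), v.end(), v.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    static const std::array<const char *, 5> truthy = {"yes", "enable", "y", "1", "on"};
    return std::any_of(truthy.begin(), truthy.end(),
                       [&v](const char * t) { return v == t; });
}
```

With this reading, `GGML_CANN_ASYNC_MODE=yEs` enables the mode, while values such as `0` or `off` do not.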

bachelor-dou avatar Apr 25 '25 08:04 bachelor-dou

@hipudding

bachelor-dou avatar Apr 25 '25 09:04 bachelor-dou

Update doc.

hipudding avatar May 19 '25 03:05 hipudding

GGML_CANN_ASYNC_MODE

Enables asynchronous operator submission. Disabled by default.

GGML_CANN_MEM_POOL

Specifies the memory pool management strategy:

  • vmm: Utilizes a virtual memory manager pool. If hardware support for VMM is unavailable, falls back to the legacy (leg) memory pool.
  • prio: Uses a priority queue-based memory pool.
  • leg: Uses a fixed-size buffer pool.

GGML_CANN_DISABLE_BUF_POOL_CLEAN

Controls automatic cleanup of the memory pool. This option is only effective when using the prio or leg memory pool strategies.

hipudding avatar May 23 '25 08:05 hipudding