belog2867

Results 2 issues of belog2867

The model was loaded twice, and the 1B Llama model used more than 10 GB of RAM. Is this normal? [新建 文本文档.txt](https://github.com/user-attachments/files/18723996/default.txt)

### Name and Version

ggml_opencl: using kernels optimized for Adreno (GGML_OPENCL_USE_ADRENO_KERNELS)
version: 4727 (c2ea16f2)
built with Android (11349228, +pgo, +bolt, +lto, -mlgo, based on r487747e) clang version 17.0.2 (https://android.googlesource.com/toolchain/llvm-project d9f89f4d16663d5012e5c09495f3b30ece3d2362)...

bug-unconfirmed