Unable to start pre-defined models in LLaMA C/C++ (Local)
What happened?
Unable to start pre-defined models in LLaMA C/C++ (Local) on a Mac Pro M2.
Relevant log output or stack trace
Steps to reproduce
No response
CodeGPT version
2.16.0-241.1
Operating System
macOS
If you brew install llama.cpp and then run the server locally against one of the models the plugin has already downloaded, you can start using it. This doesn't really answer the question, but in case anyone's stuck and wants to play with a self-hosted model, the exact commands are below.
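A minimal sketch of that workaround (the model path and port are the ones from my setup; adjust them to whatever the plugin downloaded for you):

```sh
# Install llama.cpp via Homebrew; this provides the llama-server binary
brew install llama.cpp

# Serve a model the plugin has already downloaded, on the chosen port
llama-server -m ~/.codegpt/models/gguf/qwen2.5-coder-1.5b-instruct-q8_0.gguf --port 51150
```

The server then listens on http://localhost:51150 and you can point clients at it.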
@carlrobertoh Any ETA for this being resolved? Or do you know why it is happening? I cannot tell from any of the logs what is going on or why it fails.
I noticed this behaviour after I upgraded the llama.cpp submodule. However, it doesn't seem to happen when running the extension locally. I haven't had time to dive into this yet.
I don't know whether it's exactly the same issue, but I also have a problem where the server does not start, with the following error:
Cannot run program "cmake" (in directory "...../plugins/ProxyAI/llama.cpp"): error=2, No such file or directory
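For what it's worth, error=2 (No such file or directory) in that message means the cmake binary itself could not be found, not that a project file is missing. A quick sanity check on macOS (assuming Homebrew; this is my guess at the cause, not an official fix):

```sh
# Verify cmake is visible on the PATH the IDE inherits
which cmake

# If nothing is found, install it, e.g. via Homebrew
brew install cmake
```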
In my case, with the latest version of the plugin from the JetBrains plugin repo (https://plugins.jetbrains.com/plugin/21056-proxy-ai), I see the following error in the logs. It looks like the engine compiles, but it will not run the model; the error keeps returning unknown type name 'block_q4_0':
2025-03-27 09:07:03,440 [ 312846] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - llama_init_from_model: n_ctx_per_seq (2048) < n_ctx_train (8192) -- the full capacity of the model will not be utilized
2025-03-27 09:07:03,440 [ 312846] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ggml_metal_init: allocating
2025-03-27 09:07:03,440 [ 312846] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ggml_metal_init: found device: Apple M4 Pro
2025-03-27 09:07:03,440 [ 312846] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ggml_metal_init: picking default device: Apple M4 Pro
2025-03-27 09:07:03,441 [ 312847] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ggml_metal_init: using embedded metal library
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ggml_metal_init: error: Error Domain=MTLLibraryErrorDomain Code=3 "program_source:61:35: error: unknown type name 'block_q4_0'
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - void dequantize_q4_0(device const block_q4_0 * xb, short il, thread type4x4 & reg) {
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ^
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - program_source:80:38: error: unknown type name 'block_q4_0'
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - void dequantize_q4_0_t4(device const block_q4_0 * xb, short il, thread type4 & reg) {
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - ^
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - program_source:95:35: error: unknown type name 'block_q4_1'
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent - void dequantize_q4_1(device const block_q4_1 * xb, short il, thread type4x4 & reg) {
2025-03-27 09:07:03,587 [ 312993] INFO - #ee.carlrobert.codegpt.completions.llama.LlamaServerAgent -
Got the same issue today, plugin version 3.2.5-241.1.
This will be fixed in the next version along with some other improvements.
https://github.com/user-attachments/assets/fdbcbdb6-8a8b-4042-8791-021f0186ed00