
warning: failed to mlock 86016-byte buffer (after previously locking 0 bytes): Cannot allocate memory

Open · Agility0493 opened this issue on Aug 21, 2023 · 2 comments

Running into this mlock issue when running `docker-compose up` with the docker-compose.yml file from the cloned repository. I did a `git pull` on the repo today.

All I could find on the subject is this thread in another repository: https://github.com/abetlen/llama-cpp-python/issues/254

Following that thread, I adjusted the mlock limit on the server to cover the model's memory footprint, and I also tried `ulimit -l unlimited`.

After making this change I completely removed and rebuilt the images with `docker system prune -a --volumes`, but I still get the mlock error on `docker-compose up`.
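For what it's worth, running `ulimit -l` in the host shell doesn't change the limit inside the container; the memlock limit has to be raised on the container itself. Below is a minimal sketch of one way to do that with a compose override file. The service name is taken from the log prefix, and `-1` (unlimited) is an assumption, not a value from the repo's own compose file.

```sh
# Sketch: raise RLIMIT_MEMLOCK for the api container via a compose override.
# Service name taken from the log prefix ("llama-gpt-api-7b"); adjust if yours differs.
cat > docker-compose.override.yml <<'EOF'
services:
  llama-gpt-api-7b:
    ulimits:
      memlock: -1   # -1 = unlimited (applies to both soft and hard limits)
EOF

# Recreate the container so the new limit takes effect.
docker-compose up --force-recreate
```

docker-compose picks up `docker-compose.override.yml` automatically and merges it with `docker-compose.yml`, so the override doesn't have to repeat the rest of the service definition.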

Console error output:

```
llama-gpt-api-7b_1  | /usr/local/lib/python3.11/site-packages/pydantic/_internal/fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model".
llama-gpt-api-7b_1  |
llama-gpt-api-7b_1  | You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
llama-gpt-api-7b_1  |   warnings.warn(
llama-gpt-api-7b_1  | llama.cpp: loading model from /models/llama-2-7b-chat.bin
llama-gpt-api-7b_1  | llama_model_load_internal: format     = ggjt v3 (latest)
llama-gpt-api-7b_1  | llama_model_load_internal: n_vocab    = 32000
llama-gpt-api-7b_1  | llama_model_load_internal: n_ctx      = 4096
llama-gpt-api-7b_1  | llama_model_load_internal: n_embd     = 4096
llama-gpt-api-7b_1  | llama_model_load_internal: n_mult     = 5504
llama-gpt-api-7b_1  | llama_model_load_internal: n_head     = 32
llama-gpt-api-7b_1  | llama_model_load_internal: n_head_kv  = 32
llama-gpt-api-7b_1  | llama_model_load_internal: n_layer    = 32
llama-gpt-api-7b_1  | llama_model_load_internal: n_rot      = 128
llama-gpt-api-7b_1  | llama_model_load_internal: n_gqa      = 1
llama-gpt-api-7b_1  | llama_model_load_internal: rnorm_eps  = 5.0e-06
llama-gpt-api-7b_1  | llama_model_load_internal: n_ff       = 11008
llama-gpt-api-7b_1  | llama_model_load_internal: freq_base  = 10000.0
llama-gpt-api-7b_1  | llama_model_load_internal: freq_scale = 1
llama-gpt-api-7b_1  | llama_model_load_internal: ftype      = 2 (mostly Q4_0)
llama-gpt-api-7b_1  | llama_model_load_internal: model size = 7B
llama-gpt-api-7b_1  | llama_model_load_internal: ggml ctx size = 0.08 MB
llama-gpt-api-7b_1  | warning: failed to mlock 86016-byte buffer (after previously locking 0 bytes): Cannot allocate memory
llama-gpt-api-7b_1  | Try increasing RLIMIT_MLOCK ('ulimit -l' as root).
llama-gpt-api-7b_1  | error loading model: llama.cpp: tensor 'layers.30.ffn_norm.weight' is missing from model
llama-gpt-api-7b_1  | llama_load_model_from_file: failed to load model
llama-gpt-api-7b_1  | Traceback (most recent call last):
llama-gpt-api-7b_1  |   File "<frozen runpy>", line 198, in _run_module_as_main
llama-gpt-api-7b_1  |   File "<frozen runpy>", line 88, in _run_code
llama-gpt-api-7b_1  |   File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llama-gpt-api-7b_1  |     app = create_app(settings=settings)
llama-gpt-api-7b_1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-7b_1  |   File "/app/llama_cpp/server/app.py", line 317, in create_app
llama-gpt-api-7b_1  |     llama = llama_cpp.Llama(
llama-gpt-api-7b_1  |             ^^^^^^^^^^^^^^^^
llama-gpt-api-7b_1  |   File "/app/llama_cpp/llama.py", line 328, in __init__
llama-gpt-api-7b_1  |     assert self.model is not None
llama-gpt-api-7b_1  |            ^^^^^^^^^^^^^^^^^^^^^^
llama-gpt-api-7b_1  | AssertionError
llama-gpt_llama-gpt-api-7b_1 exited with code 1
```

Agility0493 · Aug 21 '23, 16:08

Yeah, I've been having this issue all day. I tried increasing the memory using `docker update --memory` and `docker update --memory-swap`, but I'm not that experienced with Docker and a little out of my depth.
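Roughly what I was attempting, for reference (the sizes are just guesses on my part, and the container name is taken from the compose log prefix):

```sh
# Raise the memory and swap limits on the already-running container.
# Both flags are set together because --memory-swap must be >= --memory.
docker update --memory 8g --memory-swap 8g llama-gpt_llama-gpt-api-7b_1
```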

VivaWolf · Aug 21 '23, 20:08

Check the model file. It may not have been fully downloaded.
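A quick check would be something like the sketch below. The host path assumes the models directory in the llama-gpt checkout; inside the container it shows up as /models/llama-2-7b-chat.bin, as in the log above.

```sh
# Look at the size of the downloaded model on the host. A complete Q4_0 7B GGML
# file is a few GB; a much smaller file means the download was cut short.
ls -lh ./models/llama-2-7b-chat.bin

# If it looks truncated, delete it and fetch it again (or let the container
# download it again on the next `docker-compose up`, if the project does that
# automatically on startup).
```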

seahurt · Aug 30 '23, 04:08