llama-gpt
warning: failed to mlock 86016-byte buffer (after previously locking 0 bytes): Cannot allocate memory
Running into this mlock issue when running docker-compose up with the repository's docker-compose.yml file. I did a git pull for the repo today.
All I could find on the subject is this thread in another repository: https://github.com/abetlen/llama-cpp-python/issues/254
Based on that thread, I adjusted the server's mlock setting to cover the memory limit, and I also tried "ulimit -l unlimited".
After making that change I completely removed and rebuilt the images with "docker system prune -a --volumes", and I still get the mlock warning on docker-compose up.
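In case it helps with reproducing: here is a minimal sketch of raising the container's memlock limit through a compose override file. The service name llama-gpt-api-7b is an assumption taken from the log prefix below; adjust it to whatever your docker-compose.yml actually uses.

cat > docker-compose.override.yml <<'EOF'
# Hypothetical override file; assumes the API service is named llama-gpt-api-7b.
# docker-compose merges docker-compose.override.yml with docker-compose.yml automatically.
services:
  llama-gpt-api-7b:
    ulimits:
      memlock:
        soft: -1   # lift the soft RLIMIT_MEMLOCK cap inside the container
        hard: -1   # lift the hard cap as well
EOF
docker-compose up

Note that "ulimit -l unlimited" on the host only affects the host shell; the container gets its own RLIMIT_MEMLOCK, which is what the override above raises.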
Console error output:
llama-gpt-api-7b_1 | /usr/local/lib/python3.11/site-packages/pydantic/_internal/fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model".
llama-gpt-api-7b_1 |
llama-gpt-api-7b_1 | You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',).
llama-gpt-api-7b_1 | warnings.warn(
llama-gpt-api-7b_1 | llama.cpp: loading model from /models/llama-2-7b-chat.bin
llama-gpt-api-7b_1 | llama_model_load_internal: format = ggjt v3 (latest)
llama-gpt-api-7b_1 | llama_model_load_internal: n_vocab = 32000
llama-gpt-api-7b_1 | llama_model_load_internal: n_ctx = 4096
llama-gpt-api-7b_1 | llama_model_load_internal: n_embd = 4096
llama-gpt-api-7b_1 | llama_model_load_internal: n_mult = 5504
llama-gpt-api-7b_1 | llama_model_load_internal: n_head = 32
llama-gpt-api-7b_1 | llama_model_load_internal: n_head_kv = 32
llama-gpt-api-7b_1 | llama_model_load_internal: n_layer = 32
llama-gpt-api-7b_1 | llama_model_load_internal: n_rot = 128
llama-gpt-api-7b_1 | llama_model_load_internal: n_gqa = 1
llama-gpt-api-7b_1 | llama_model_load_internal: rnorm_eps = 5.0e-06
llama-gpt-api-7b_1 | llama_model_load_internal: n_ff = 11008
llama-gpt-api-7b_1 | llama_model_load_internal: freq_base = 10000.0
llama-gpt-api-7b_1 | llama_model_load_internal: freq_scale = 1
llama-gpt-api-7b_1 | llama_model_load_internal: ftype = 2 (mostly Q4_0)
llama-gpt-api-7b_1 | llama_model_load_internal: model size = 7B
llama-gpt-api-7b_1 | llama_model_load_internal: ggml ctx size = 0.08 MB
llama-gpt-api-7b_1 | warning: failed to mlock 86016-byte buffer (after previously locking 0 bytes): Cannot allocate memory
llama-gpt-api-7b_1 | Try increasing RLIMIT_MLOCK ('ulimit -l' as root).
llama-gpt-api-7b_1 | error loading model: llama.cpp: tensor 'layers.30.ffn_norm.weight' is missing from model
llama-gpt-api-7b_1 | llama_load_model_from_file: failed to load model
llama-gpt-api-7b_1 | Traceback (most recent call last):
llama-gpt-api-7b_1 | File "
Yeah, I've been having this issue all day.
I tried increasing the memory using docker update --memory and docker update --memory-swap, but I'm not that experienced with Docker and am a little out of my depth.
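For what it's worth, docker update changes a running container's cgroup memory limits, not the RLIMIT_MEMLOCK that the warning mentions, so it won't silence the mlock message. A sketch of the commands, assuming the container name llama-gpt-api-7b_1 from the log prefix (check docker ps for yours):

docker update --memory 8g --memory-swap 8g llama-gpt-api-7b_1
docker inspect --format 'mem={{.HostConfig.Memory}} swap={{.HostConfig.MemorySwap}}' llama-gpt-api-7b_1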
Check the model file; it may not have been fully downloaded. The actual failure in your log is "error loading model: llama.cpp: tensor 'layers.30.ffn_norm.weight' is missing from model", which usually means the model file is truncated or corrupted; the mlock line above it is only a warning.
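A quick way to check, as a sketch: the container path /models/llama-2-7b-chat.bin comes from the log above, and the host-side ./models/ location is an assumption about how the compose file mounts the models directory.

ls -lh ./models/llama-2-7b-chat.bin    # a Q4_0 7B model should be several GB (roughly 3.5-4 GB)
sha256sum ./models/llama-2-7b-chat.bin # compare against a published checksum for the file, if available
rm ./models/llama-2-7b-chat.bin        # if it looks truncated, delete it
docker-compose up                      # and let the stack re-download the model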