
Unable to run code-7b

Open chiu0602 opened this issue 2 years ago • 14 comments

After cloning the repository and executing ./run.sh --model code-7b, the llama-gpt-api container keeps restarting and the following log keeps looping.

I have also tried ./run.sh --model code-13b and got a similar error log, but plain ./run.sh (using the 7b model) runs fine.

2023-09-08 18:51:33 /usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
2023-09-08 18:51:33 !!
2023-09-08 18:51:33 
2023-09-08 18:51:33         ********************************************************************************
2023-09-08 18:51:33         Please avoid running ``setup.py`` and ``easy_install``.
2023-09-08 18:51:33         Instead, use pypa/build, pypa/installer or other
2023-09-08 18:51:33         standards-based tools.
2023-09-08 18:51:33 
2023-09-08 18:51:33         See https://github.com/pypa/setuptools/issues/917 for details.
2023-09-08 18:51:33         ********************************************************************************
2023-09-08 18:51:33 
2023-09-08 18:51:33 !!
2023-09-08 18:51:33   easy_install.initialize_options(self)
2023-09-08 18:51:33 CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
2023-09-08 18:51:33   Compatibility with CMake < 3.5 will be removed from a future version of
2023-09-08 18:51:33   CMake.
2023-09-08 18:51:33 
2023-09-08 18:51:33   Update the VERSION argument <min> value or use a ...<max> suffix to tell
2023-09-08 18:51:33   CMake that the project does not need compatibility with older versions.
2023-09-08 18:51:33 
2023-09-08 18:51:33 /models/code-llama-7b-chat.gguf model found.
2023-09-08 18:51:33 python3 setup.py develop
2023-09-08 18:51:33 
2023-09-08 18:51:33 
2023-09-08 18:51:33 --------------------------------------------------------------------------------
2023-09-08 18:51:33 -- Trying 'Ninja' generator
2023-09-08 18:51:33 --------------------------------
2023-09-08 18:51:33 ---------------------------
2023-09-08 18:51:33 ----------------------
2023-09-08 18:51:33 -----------------
2023-09-08 18:51:33 ------------
2023-09-08 18:51:33 -------
2023-09-08 18:51:33 --
2023-09-08 18:51:33 Not searching for unused variables given on the command line.
2023-09-08 18:51:33 -- The C compiler identification is GNU 10.2.1
2023-09-08 18:51:33 -- Detecting C compiler ABI info
2023-09-08 18:51:33 -- Detecting C compiler ABI info - done
2023-09-08 18:51:33 -- Check for working C compiler: /usr/bin/cc - skipped
2023-09-08 18:51:33 -- Detecting C compile features
2023-09-08 18:51:33 -- Detecting C compile features - done
2023-09-08 18:51:33 -- The CXX compiler identification is GNU 10.2.1
2023-09-08 18:51:33 -- Detecting CXX compiler ABI info
2023-09-08 18:51:33 -- Detecting CXX compiler ABI info - done
2023-09-08 18:51:33 -- Check for working CXX compiler: /usr/bin/c++ - skipped
2023-09-08 18:51:33 -- Detecting CXX compile features
2023-09-08 18:51:33 -- Detecting CXX compile features - done
2023-09-08 18:51:33 -- Configuring done (0.3s)
2023-09-08 18:51:33 -- Generating done (0.0s)
2023-09-08 18:51:33 -- Build files have been written to: /app/_cmake_test_compile/build
2023-09-08 18:51:33 --
2023-09-08 18:51:33 -------
2023-09-08 18:51:33 ------------
2023-09-08 18:51:33 -----------------
2023-09-08 18:51:33 ----------------------
2023-09-08 18:51:33 ---------------------------
2023-09-08 18:51:33 
2023-09-08 18:51:33 --------------------------------
2023-09-08 18:51:33 -- Trying 'Ninja' generator - success
2023-09-08 18:51:33 --------------------------------------------------------------------------------
2023-09-08 18:51:33 
2023-09-08 18:51:33 Configuring Project
2023-09-08 18:51:33   Working directory:
2023-09-08 18:51:33     /app/_skbuild/linux-x86_64-3.11/cmake-build
2023-09-08 18:51:33   Command:
2023-09-08 18:51:33     /usr/local/lib/python3.11/site-packages/cmake/data/bin/cmake /app -G Ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/app/_skbuild/linux-x86_64-3.11/cmake-install -DPYTHON_VERSION_STRING:STRING=3.11.5 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/usr/local/lib/python3.11/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/local/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.11 -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.11.so -DPython_EXECUTABLE:PATH=/usr/local/bin/python3 -DPython_ROOT_DIR:PATH=/usr/local -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.11 -DPython_NumPy_INCLUDE_DIRS:PATH=/usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg/numpy/core/include -DPython3_EXECUTABLE:PATH=/usr/local/bin/python3 -DPython3_ROOT_DIR:PATH=/usr/local -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.11 -DPython3_NumPy_INCLUDE_DIRS:PATH=/usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg/numpy/core/include -DCMAKE_BUILD_TYPE:STRING=Release
2023-09-08 18:51:33 
2023-09-08 18:51:33 Not searching for unused variables given on the command line.
2023-09-08 18:51:33 -- The C compiler identification is GNU 10.2.1
2023-09-08 18:51:33 -- The CXX compiler identification is GNU 10.2.1
2023-09-08 18:51:34 -- Detecting C compiler ABI info
2023-09-08 18:51:34 -- Detecting C compiler ABI info - done
2023-09-08 18:51:34 -- Check for working C compiler: /usr/bin/cc - skipped
2023-09-08 18:51:34 -- Detecting C compile features
2023-09-08 18:51:34 -- Detecting C compile features - done
2023-09-08 18:51:34 -- Detecting CXX compiler ABI info
2023-09-08 18:51:34 -- Detecting CXX compiler ABI info - done
2023-09-08 18:51:34 -- Check for working CXX compiler: /usr/bin/c++ - skipped
2023-09-08 18:51:34 -- Detecting CXX compile features
2023-09-08 18:51:34 -- Detecting CXX compile features - done
2023-09-08 18:51:34 -- Configuring done (0.3s)
2023-09-08 18:51:34 -- Generating done (0.0s)
2023-09-08 18:51:34 -- Build files have been written to: /app/_skbuild/linux-x86_64-3.11/cmake-build
2023-09-08 18:51:52 [1/2] Generating /app/vendor/llama.cpp/libllama.so
2023-09-08 18:51:52 make[1]: Entering directory '/app/vendor/llama.cpp'
2023-09-08 18:51:52 I llama.cpp build info: 
2023-09-08 18:51:52 I UNAME_S:  Linux
2023-09-08 18:51:52 I UNAME_P:  unknown
2023-09-08 18:51:52 I UNAME_M:  x86_64
2023-09-08 18:51:52 I CFLAGS:   -I.            -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
2023-09-08 18:51:52 I CXXFLAGS: -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS
2023-09-08 18:51:52 I LDFLAGS:  
2023-09-08 18:51:52 I CC:       cc (Debian 10.2.1-6) 10.2.1 20210110
2023-09-08 18:51:52 I CXX:      g++ (Debian 10.2.1-6) 10.2.1 20210110
2023-09-08 18:51:52 
2023-09-08 18:51:52 g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -c llama.cpp -o llama.o
2023-09-08 18:51:52 cc  -I.            -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS   -c ggml.c -o ggml.o
2023-09-08 18:51:52 cc -I.            -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS   -c -o k_quants.o k_quants.c
2023-09-08 18:51:52 k_quants.c:182:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function]
2023-09-08 18:51:52   182 | static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min,
2023-09-08 18:51:52       |              ^~~~~~~~~~~~~~~~
2023-09-08 18:51:52 cc  -I.            -O3 -std=c11   -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS   -c ggml-alloc.c -o ggml-alloc.o
2023-09-08 18:51:52 g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -shared -fPIC -o libllama.so llama.o ggml.o k_quants.o ggml-alloc.o 
2023-09-08 18:51:52 make[1]: Leaving directory '/app/vendor/llama.cpp'
2023-09-08 18:51:52 [1/2] Install the project...
2023-09-08 18:51:52 -- Install configuration: "Release"
2023-09-08 18:51:52 -- Installing: /app/_skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so
2023-09-08 18:51:52 copying _skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so -> llama_cpp/libllama.so
2023-09-08 18:51:52 
2023-09-08 18:51:52 running develop
2023-09-08 18:51:52 /usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated.
2023-09-08 18:51:52 !!
2023-09-08 18:51:52 
2023-09-08 18:51:52         ********************************************************************************
2023-09-08 18:51:52         Please avoid running ``setup.py`` and ``easy_install``.
2023-09-08 18:51:52 running egg_info
2023-09-08 18:51:52 writing llama_cpp_python.egg-info/PKG-INFO
2023-09-08 18:51:52 writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt
2023-09-08 18:51:52 writing requirements to llama_cpp_python.egg-info/requires.txt
2023-09-08 18:51:52 writing top-level names to llama_cpp_python.egg-info/top_level.txt
2023-09-08 18:51:52 reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
2023-09-08 18:51:52         Instead, use pypa/build, pypa/installer or other
2023-09-08 18:51:52         standards-based tools.
2023-09-08 18:51:52 
2023-09-08 18:51:52         See https://github.com/pypa/setuptools/issues/917 for details.
2023-09-08 18:51:52         ********************************************************************************
2023-09-08 18:51:52 
2023-09-08 18:51:52 !!
2023-09-08 18:51:52   easy_install.initialize_options(self)
2023-09-08 18:51:52 /usr/local/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated.
2023-09-08 18:51:52 !!
2023-09-08 18:51:52 
2023-09-08 18:51:52         ********************************************************************************
2023-09-08 18:51:52         Please avoid running ``setup.py`` directly.
2023-09-08 18:51:52         Instead, use pypa/build, pypa/installer or other
2023-09-08 18:51:52         standards-based tools.
2023-09-08 18:51:52 
2023-09-08 18:51:52         See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
2023-09-08 18:51:52         ********************************************************************************
2023-09-08 18:51:52 
2023-09-08 18:51:52 !!
2023-09-08 18:51:52   self.initialize_options()
2023-09-08 18:51:52 adding license file 'LICENSE.md'
2023-09-08 18:51:52 writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt'
2023-09-08 18:51:52 running build_ext
2023-09-08 18:51:52 Creating /usr/local/lib/python3.11/site-packages/llama-cpp-python.egg-link (link to .)
2023-09-08 18:51:52 llama-cpp-python 0.1.80 is already the active version in easy-install.pth
2023-09-08 18:51:52 
2023-09-08 18:51:52 Installed /app
2023-09-08 18:51:52 Processing dependencies for llama-cpp-python==0.1.80
2023-09-08 18:51:52 Searching for diskcache==5.6.1
2023-09-08 18:51:52 Best match: diskcache 5.6.1
2023-09-08 18:51:52 Processing diskcache-5.6.1-py3.11.egg
2023-09-08 18:51:52 Adding diskcache 5.6.1 to easy-install.pth file
2023-09-08 18:51:52 
2023-09-08 18:51:52 Using /usr/local/lib/python3.11/site-packages/diskcache-5.6.1-py3.11.egg
2023-09-08 18:51:52 Searching for numpy==1.26.0b1
2023-09-08 18:51:52 Best match: numpy 1.26.0b1
2023-09-08 18:51:52 Processing numpy-1.26.0b1-py3.11-linux-x86_64.egg
2023-09-08 18:51:52 Adding numpy 1.26.0b1 to easy-install.pth file
2023-09-08 18:51:52 Installing f2py script to /usr/local/bin
2023-09-08 18:51:52 
2023-09-08 18:51:52 Using /usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg
2023-09-08 18:51:52 Searching for typing-extensions==4.7.1
2023-09-08 18:51:52 Best match: typing-extensions 4.7.1
2023-09-08 18:51:52 Adding typing-extensions 4.7.1 to easy-install.pth file
2023-09-08 18:51:52 
2023-09-08 18:51:52 Using /usr/local/lib/python3.11/site-packages
2023-09-08 18:51:52 Finished processing dependencies for llama-cpp-python==0.1.80
2023-09-08 18:51:52 Initializing server with:
2023-09-08 18:51:53 /usr/local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
2023-09-08 18:51:53 
2023-09-08 18:51:53 You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
2023-09-08 18:51:53   warnings.warn(

chiu0602 avatar Sep 08 '23 10:09 chiu0602

Seeing the same final error on code-34b as well; specifically:

/usr/local/lib/python3.11/site-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
warnings.warn(
llamagpt-llama-gpt-api-1 exited with code 139

After this it returns to the [llama-gpt-api:8000] not yet available output for a short while, then repeats the startup attempt and shows the above error again.
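For what it's worth, exit code 139 means the API process died with a segmentation fault (128 + SIGSEGV), so the useful detail is usually in that container's own log rather than in the wait loop printed by the UI. A minimal sketch for watching it, with the container name taken from the log above (adjust it for your compose project name):

# Follow the API container's log to catch whatever it prints right before the segfault
docker logs -f llamagpt-llama-gpt-api-1

# Confirm the recorded exit code (139 = 128 + 11, i.e. SIGSEGV)
docker inspect --format '{{.State.ExitCode}}' llamagpt-llama-gpt-api-1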

s3rvant avatar Sep 09 '23 22:09 s3rvant

13b is also failing (in both of my tests, via Docker on Ubuntu 23.04) with:

llamagpt-llama-gpt-api-1  | error loading model: llama.cpp: tensor 'layers.23.ffn_norm.weight' is missing from model
llamagpt-llama-gpt-api-1  | llama_load_model_from_file: failed to load model
llamagpt-llama-gpt-api-1  | Traceback (most recent call last):
llamagpt-llama-gpt-api-1  |   File "<frozen runpy>", line 198, in _run_module_as_main
llamagpt-llama-gpt-api-1  |   File "<frozen runpy>", line 88, in _run_code
llamagpt-llama-gpt-api-1  |   File "/app/llama_cpp/server/__main__.py", line 46, in <module>
llamagpt-llama-gpt-api-1  |     app = create_app(settings=settings)
llamagpt-llama-gpt-api-1  |           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
llamagpt-llama-gpt-api-1  |   File "/app/llama_cpp/server/app.py", line 317, in create_app
llamagpt-llama-gpt-api-1  |     llama = llama_cpp.Llama(
llamagpt-llama-gpt-api-1  |             ^^^^^^^^^^^^^^^^
llamagpt-llama-gpt-api-1  |   File "/app/llama_cpp/llama.py", line 328, in __init__
llamagpt-llama-gpt-api-1  |     assert self.model is not None
llamagpt-llama-gpt-api-1  |            ^^^^^^^^^^^^^^^^^^^^^^
llamagpt-llama-gpt-api-1  | AssertionError
llamagpt-llama-gpt-api-1 exited with code 1

s3rvant avatar Sep 10 '23 04:09 s3rvant

I am getting the same errors with code-7b, code-13b, and code-34b. Note: the "layers.xx.ffn_norm.weight" error is, I believe, due to the model download being incomplete or otherwise damaged, so I don't think it's related to the initial problem.
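If the download-corruption theory is right, the simplest check is to delete the model file and let run.sh fetch it again. A minimal sketch, assuming the compose setup mounts the repo's ./models directory at /models, as the "/models/code-llama-7b-chat.gguf model found." line earlier suggests (adjust the filename for the 13b/34b variants):

# Stop the stack, remove the possibly truncated model, and re-download it on the next run
docker compose down
rm ./models/code-llama-7b-chat.gguf
./run.sh --model code-7b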

Console output from the last line of the code-34b model download to the 139 error, at which point I left it for an hour and then killed the containers:

`100 18.8G 100 18.8G 0 0 32.0M 0 0:10:01 0:10:01 --:--:-- 32.8M [llama-gpt-api] | python3 setup.py develop [INFO wait] Host [llama-gpt-api:8000] not yet available... /usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated. !!

    ********************************************************************************
    Please avoid running ``setup.py`` and ``easy_install``.
    Instead, use pypa/build, pypa/installer or other
    standards-based tools.

    See https://github.com/pypa/setuptools/issues/917 for details.
    ********************************************************************************

!! easy_install.initialize_options(self) [llama-gpt-api] | [llama-gpt-api] | [llama-gpt-api] | -------------------------------------------------------------------------------- [llama-gpt-api] | -- Trying 'Ninja' generator [llama-gpt-api] | -------------------------------- [llama-gpt-api] | --------------------------- [llama-gpt-api] | ---------------------- [llama-gpt-api] | ----------------- [llama-gpt-api] | ------------ [llama-gpt-api] | ------- [llama-gpt-api] | -- CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required): Compatibility with CMake < 3.5 will be removed from a future version of CMake.

Update the VERSION argument value or use a ... suffix to tell CMake that the project does not need compatibility with older versions.

[llama-gpt-api] | Not searching for unused variables given on the command line. [llama-gpt-api] | -- The C compiler identification is GNU 10.2.1 [llama-gpt-api] | -- Detecting C compiler ABI info [llama-gpt-api] | -- Detecting C compiler ABI info - done [llama-gpt-api] | -- Check for working C compiler: /usr/bin/cc - skipped [llama-gpt-api] | -- Detecting C compile features [llama-gpt-api] | -- Detecting C compile features - done [llama-gpt-api] | -- The CXX compiler identification is GNU 10.2.1 [llama-gpt-api] | -- Detecting CXX compiler ABI info [llama-gpt-api] | -- Detecting CXX compiler ABI info - done [llama-gpt-api] | -- Check for working CXX compiler: /usr/bin/c++ - skipped [llama-gpt-api] | -- Detecting CXX compile features [llama-gpt-api] | -- Detecting CXX compile features - done [llama-gpt-api] | -- Configuring done (0.2s) [llama-gpt-api] | -- Generating done (0.0s) [llama-gpt-api] | -- Build files have been written to: /app/_cmake_test_compile/build [llama-gpt-api] | -- [llama-gpt-api] | ------- [llama-gpt-api] | ------------ [llama-gpt-api] | ----------------- [llama-gpt-api] | ---------------------- [llama-gpt-api] | --------------------------- [llama-gpt-api] | -------------------------------- [llama-gpt-api] | -- Trying 'Ninja' generator - success [llama-gpt-api] | -------------------------------------------------------------------------------- [llama-gpt-api] | [llama-gpt-api] | Configuring Project [llama-gpt-api] | Working directory: [llama-gpt-api] | /app/_skbuild/linux-x86_64-3.11/cmake-build [llama-gpt-api] | Command: [llama-gpt-api] | /usr/local/lib/python3.11/site-packages/cmake/data/bin/cmake /app -G Ninja --no-warn-unused-cli -DCMAKE_INSTALL_PREFIX:PATH=/app/_skbuild/linux-x86_64-3.11/cmake-install -DPYTHON_VERSION_STRING:STRING=3.11.5 -DSKBUILD:INTERNAL=TRUE -DCMAKE_MODULE_PATH:PATH=/usr/local/lib/python3.11/site-packages/skbuild/resources/cmake -DPYTHON_EXECUTABLE:PATH=/usr/local/bin/python3 -DPYTHON_INCLUDE_DIR:PATH=/usr/local/include/python3.11 -DPYTHON_LIBRARY:PATH=/usr/local/lib/libpython3.11.so -DPython_EXECUTABLE:PATH=/usr/local/bin/python3 -DPython_ROOT_DIR:PATH=/usr/local -DPython_FIND_REGISTRY:STRING=NEVER -DPython_INCLUDE_DIR:PATH=/usr/local/include/python3.11 -DPython_NumPy_INCLUDE_DIRS:PATH=/usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg/numpy/core/include -DPython3_EXECUTABLE:PATH=/usr/local/bin/python3 -DPython3_ROOT_DIR:PATH=/usr/local -DPython3_FIND_REGISTRY:STRING=NEVER -DPython3_INCLUDE_DIR:PATH=/usr/local/include/python3.11 -DPython3_NumPy_INCLUDE_DIRS:PATH=/usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg/numpy/core/include -DCMAKE_BUILD_TYPE:STRING=Release [llama-gpt-api] | [llama-gpt-api] | Not searching for unused variables given on the command line. 
[llama-gpt-api] | -- The C compiler identification is GNU 10.2.1 [llama-gpt-api] | -- The CXX compiler identification is GNU 10.2.1 [llama-gpt-api] | -- Detecting C compiler ABI info [llama-gpt-api] | -- Detecting C compiler ABI info - done [llama-gpt-api] | -- Check for working C compiler: /usr/bin/cc - skipped [llama-gpt-api] | -- Detecting C compile features [llama-gpt-api] | -- Detecting C compile features - done [llama-gpt-api] | -- Detecting CXX compiler ABI info [llama-gpt-api] | -- Detecting CXX compiler ABI info - done [llama-gpt-api] | -- Check for working CXX compiler: /usr/bin/c++ - skipped [llama-gpt-api] | -- Detecting CXX compile features [llama-gpt-api] | -- Detecting CXX compile features - done [llama-gpt-api] | -- Configuring done (0.1s) [llama-gpt-api] | -- Generating done (0.0s) [llama-gpt-api] | -- Build files have been written to: /app/_skbuild/linux-x86_64-3.11/cmake-build [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [llama-gpt-api] | [1/2] Generating /app/vendor/llama.cpp/libllama.so [llama-gpt-api] | make[1]: Entering directory '/app/vendor/llama.cpp' [llama-gpt-api] | I llama.cpp build info: [llama-gpt-api] | I UNAME_S: Linux [llama-gpt-api] | I UNAME_P: unknown [llama-gpt-api] | I UNAME_M: x86_64 [llama-gpt-api] | I CFLAGS: -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS [llama-gpt-api] | I CXXFLAGS: -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS [llama-gpt-api] | I LDFLAGS:
[llama-gpt-api] | I CC: cc (Debian 10.2.1-6) 10.2.1 20210110 [llama-gpt-api] | I CXX: g++ (Debian 10.2.1-6) 10.2.1 20210110 [llama-gpt-api] | [llama-gpt-api] | g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -c llama.cpp -o llama.o [llama-gpt-api] | cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -c ggml.c -o ggml.o [llama-gpt-api] | cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -c -o k_quants.o k_quants.c [llama-gpt-api] | k_quants.c:182:14: warning: ‘make_qkx1_quants’ defined but not used [-Wunused-function] [llama-gpt-api] | 182 | static float make_qkx1_quants(int n, int nmax, const float * restrict x, uint8_t * restrict L, float * restrict the_min, [llama-gpt-api] | | ^~~~~~~~~~~~~~~~ [llama-gpt-api] | cc -I. -O3 -std=c11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wmissing-prototypes -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -c ggml-alloc.c -o ggml-alloc.o [llama-gpt-api] | g++ -I. -I./common -O3 -std=c++11 -fPIC -DNDEBUG -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wno-multichar -pthread -march=native -mtune=native -DGGML_USE_K_QUANTS -shared -fPIC -o libllama.so llama.o ggml.o k_quants.o ggml-alloc.o [llama-gpt-api] | make[1]: Leaving directory '/app/vendor/llama.cpp' [llama-gpt-api] | [1/2] Install the project... [llama-gpt-api] | -- Install configuration: "Release" [llama-gpt-api] | -- Installing: /app/_skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so [llama-gpt-api] | copying _skbuild/linux-x86_64-3.11/cmake-install/llama_cpp/libllama.so -> llama_cpp/libllama.so [llama-gpt-api] | [llama-gpt-api] | running develop /usr/local/lib/python3.11/site-packages/setuptools/command/develop.py:40: EasyInstallDeprecationWarning: easy_install command is deprecated. !!

    ********************************************************************************
    Please avoid running ``setup.py`` and ``easy_install``.
    Instead, use pypa/build, pypa/installer or other
    standards-based tools.

    See https://github.com/pypa/setuptools/issues/917 for details.
    ********************************************************************************

!! easy_install.initialize_options(self) /usr/local/lib/python3.11/site-packages/setuptools/_distutils/cmd.py:66: SetuptoolsDeprecationWarning: setup.py install is deprecated. !!

    ********************************************************************************
    Please avoid running ``setup.py`` directly.
    Instead, use pypa/build, pypa/installer or other
    standards-based tools.

    See https://blog.ganssle.io/articles/2021/10/setup-py-deprecated.html for details.
    ********************************************************************************

!! self.initialize_options() [llama-gpt-api] | running egg_info [llama-gpt-api] | writing llama_cpp_python.egg-info/PKG-INFO [llama-gpt-api] | writing dependency_links to llama_cpp_python.egg-info/dependency_links.txt [llama-gpt-api] | writing requirements to llama_cpp_python.egg-info/requires.txt [llama-gpt-api] | writing top-level names to llama_cpp_python.egg-info/top_level.txt [llama-gpt-api] | reading manifest file 'llama_cpp_python.egg-info/SOURCES.txt' [llama-gpt-api] | adding license file 'LICENSE.md' [llama-gpt-api] | writing manifest file 'llama_cpp_python.egg-info/SOURCES.txt' [llama-gpt-api] | running build_ext [llama-gpt-api] | Creating /usr/local/lib/python3.11/site-packages/llama-cpp-python.egg-link (link to .) [llama-gpt-api] | llama-cpp-python 0.1.80 is already the active version in easy-install.pth [llama-gpt-api] | [llama-gpt-api] | Installed /app [llama-gpt-api] | Processing dependencies for llama-cpp-python==0.1.80 [llama-gpt-api] | Searching for diskcache==5.6.1 [llama-gpt-api] | Best match: diskcache 5.6.1 [llama-gpt-api] | Processing diskcache-5.6.1-py3.11.egg [llama-gpt-api] | Adding diskcache 5.6.1 to easy-install.pth file [llama-gpt-api] | [llama-gpt-api] | Using /usr/local/lib/python3.11/site-packages/diskcache-5.6.1-py3.11.egg [llama-gpt-api] | Searching for numpy==1.26.0b1 [llama-gpt-api] | Best match: numpy 1.26.0b1 [llama-gpt-api] | Processing numpy-1.26.0b1-py3.11-linux-x86_64.egg [llama-gpt-api] | Adding numpy 1.26.0b1 to easy-install.pth file [llama-gpt-api] | Installing f2py script to /usr/local/bin [llama-gpt-api] | [llama-gpt-api] | Using /usr/local/lib/python3.11/site-packages/numpy-1.26.0b1-py3.11-linux-x86_64.egg [llama-gpt-api] | Searching for typing-extensions==4.7.1 [llama-gpt-api] | Best match: typing-extensions 4.7.1 [llama-gpt-api] | Adding typing-extensions 4.7.1 to easy-install.pth file [llama-gpt-api] | [llama-gpt-api] | Using /usr/local/lib/python3.11/site-packages [llama-gpt-api] | Finished processing dependencies for llama-cpp-python==0.1.80 [llama-gpt-api] | Initializing server with: [llama-gpt-api] | Batch size: 2096 [llama-gpt-api] | Number of CPU threads: 24 [llama-gpt-api] | Number of GPU layers: 0 [llama-gpt-api] | Context window: 4096 /usr/local/lib/python3.11/site-packages/pydantic/_internal/fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model".

You may be able to resolve this warning by setting model_config['protected_namespaces'] = ('settings_',). warnings.warn( [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... [INFO wait] Host [llama-gpt-api:8000] not yet available... exit code: 139`

simongalton avatar Sep 10 '23 12:09 simongalton

In docker-compose-gguf.yml, change the image line to:

image: ghcr.io/abetlen/llama-cpp-python:latest #@sha256:de0fd227f348b5e43d4b5b7300f1344e712c14132914d1332182e9ecfde502b2

i.e. remove (comment out) the @sha256 block. (It works as of 12.09.23 :) )
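For reference, the edit just drops the pinned digest so Docker pulls the current :latest image instead of the old one. A hedged one-liner, assuming the image line in your copy still has the digest appended rather than commented out:

# Remove the @sha256 digest from the API image reference, then pull the fresh :latest image
sed -i 's#\(image: ghcr\.io/abetlen/llama-cpp-python:latest\)@sha256:[0-9a-f]*#\1#' docker-compose-gguf.yml
docker compose -f docker-compose-gguf.yml pull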

KoIIIeY avatar Sep 12 '23 10:09 KoIIIeY

remove @sha256 block

Thank you!

Tested this with 7b, 13b, code-13b and code-34b; all are working except 13b, which still gives the assertion error.

s3rvant avatar Sep 12 '23 13:09 s3rvant

Running Windows 10, Docker Desktop with WSL2. My api-cuda-gguf container starts up as expected, but then gets stuck in a loop, repeatedly exiting with code 139.

llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-gguf-1  | ggml_init_cublas: found 1 CUDA devices:
llama-gpt-llama-gpt-api-cuda-gguf-1  |   Device 0: NVIDIA GeForce RTX 3080 Ti, compute capability 8.6
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-gguf-1  | /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-llama-gpt-api-cuda-gguf-1  |
llama-gpt-llama-gpt-api-cuda-gguf-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-llama-gpt-api-cuda-gguf-1  |   warnings.warn(
llama-gpt-llama-gpt-api-cuda-gguf-1 exited with code 0
llama-gpt-llama-gpt-api-cuda-gguf-1  | ggml_init_cublas: found 1 CUDA devices:
llama-gpt-llama-gpt-api-cuda-gguf-1  |   Device 0: NVIDIA GeForce RTX 3080 Ti, compute capability 8.6
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-gguf-1  | /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-llama-gpt-api-cuda-gguf-1  |
llama-gpt-llama-gpt-api-cuda-gguf-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-llama-gpt-api-cuda-gguf-1  |   warnings.warn(
llama-gpt-llama-gpt-api-cuda-gguf-1 exited with code 0
llama-gpt-llama-gpt-api-cuda-gguf-1  | ggml_init_cublas: found 1 CUDA devices:
llama-gpt-llama-gpt-api-cuda-gguf-1  |   Device 0: NVIDIA GeForce RTX 3080 Ti, compute capability 8.6
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-gguf-1  | /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-llama-gpt-api-cuda-gguf-1  |
llama-gpt-llama-gpt-api-cuda-gguf-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-llama-gpt-api-cuda-gguf-1  |   warnings.warn(
llama-gpt-llama-gpt-api-cuda-gguf-1 exited with code 139
llama-gpt-llama-gpt-api-cuda-gguf-1  | ggml_init_cublas: found 1 CUDA devices:
llama-gpt-llama-gpt-api-cuda-gguf-1  |   Device 0: NVIDIA GeForce RTX 3080 Ti, compute capability 8.6
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-gguf-1  | /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:127: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
llama-gpt-llama-gpt-api-cuda-gguf-1  |
llama-gpt-llama-gpt-api-cuda-gguf-1  | You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
llama-gpt-llama-gpt-api-cuda-gguf-1  |   warnings.warn(
llama-gpt-llama-gpt-ui-1             | [INFO  wait] Host [llama-gpt-api-cuda-gguf:8000] not yet available...
llama-gpt-llama-gpt-api-cuda-gguf-1 exited with code 139

Everything I've tried so far to work around this:

- Updated everything: Docker Desktop, NVIDIA GPU drivers, Windows Update
- As @KoIIIeY suggested, removed the '#@sha256:' block in docker-compose-gguf.yml (which has helped fix this for a few others!), so it's just this:

image: ghcr.io/abetlen/llama-cpp-python:latest

- Deleted and rebuilt all containers and images in Docker
- Tried starting with different models (i.e. code-7b vs code-13b)
- In the CUDA run.sh, tried various n_gpu_layers settings (40 is the sweet spot for chat models on my GPU, but dialed this back down to the default of 10)
- Deleted and re-downloaded the model files

Other observations:

- All chat models are working fine and blazing fast when using the GPU with '--with-cuda'
- If I start code-7b without '--with-cuda', it does run and work using just the CPU, albeit slowly...

Any other ideas on how to resolve this?

Thanks in advance!

arch1v1st avatar Sep 23 '23 20:09 arch1v1st

Got it working with all code models. In /cuda/gguf.Dockerfile, I found that this line was causing my startup issues with any llama code model when using CUDA, by forcing v0.1.80 of llama-cpp-python:

RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.1.80

I tried to jump it straight to the latest (v0.2.6), but that threw new errors after building, related to:

ModuleNotFoundError: No module named 'starlette_context'

Looking at the GitHub releases from the past few weeks, starlette and other key dependencies were changed: https://github.com/abetlen/llama-cpp-python/releases

With v0.2.2-v0.2.4 the API Docker container started up fine, but every request for any coding task returned this 'nerfed' response:

"Sorry, I cannot assist you with that request because it is against my ethical guidelines. I am programmed to promote and support the well-being and safety of all individuals, and web scraping can be harmful or offensive to some people."

Downgraded to v0.2.1 and it's working great with any model (chat or code), complete with code formatting!

RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.1
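After changing the pin in /cuda/gguf.Dockerfile, the image has to be rebuilt before the new version is actually used. A minimal sketch; the compose file and service names here are assumptions based on the container names in the logs above:

# Rebuild the CUDA GGUF API image from the edited Dockerfile, then start a code model on the GPU
docker compose -f docker-compose-cuda-gguf.yml build --no-cache llama-gpt-api-cuda-gguf
./run.sh --model code-7b --with-cuda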

One observed issue with jumping back to 0.2.1+ is that these types of errors will sometimes get logged while making requests, as it seems the structure of the response payload has changed on the API side:

llama-gpt-llama-gpt-ui-1             | SyntaxError: Unexpected token 'D', "[DONE]" is not valid JSON

Can live with that for now. :) Hope this helps someone else out too!

arch1v1st avatar Sep 23 '23 23:09 arch1v1st

I fixed this by using your approach, but made it work with 0.2.6 by adding starlette-context to the gguf.Dockerfile dependencies. This is what my Dockerfile looks like now:

ARG CUDA_IMAGE="12.1.1-devel-ubuntu22.04"
FROM nvidia/cuda:${CUDA_IMAGE}

# We need to set the host to 0.0.0.0 to allow outside access
ENV HOST 0.0.0.0

RUN apt-get update && apt-get upgrade -y \
    && apt-get install -y git build-essential \
    python3 python3-pip gcc wget \
    ocl-icd-opencl-dev opencl-headers clinfo \
    libclblast-dev libopenblas-dev \
    && mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd

COPY . .

# setting build related env vars
ENV CUDA_DOCKER_ARCH=all
ENV LLAMA_CUBLAS=1

# Install dependencies
RUN python3 -m pip install --upgrade pip pytest cmake scikit-build setuptools fastapi uvicorn sse-starlette pydantic-settings starlette-context

# Install llama-cpp-python 0.2.6, which has GGUF support (built with CUDA)
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 pip install llama-cpp-python==0.2.6

# Run the server
CMD python3 -m llama_cpp.server

MrQbit avatar Oct 19 '23 13:10 MrQbit

@MrQbit: I think you should make a Pull Request with these changes. Works like a charm!

efokken-abb avatar Oct 24 '23 10:10 efokken-abb

I made your correction, which solved the previous problem, but now I'm getting this error. For context, I am on Ubuntu in WSL2 on Windows 11.

root:/containers/llama-gpt# ./run.sh --model 7b
[+] Building 16.2s (1/1) FINISHED
 => ERROR [internal] booting buildkit                                                                             16.2s
 => => pulling image moby/buildkit:buildx-stable-1                                                                 1.4s
 => => creating container buildx_buildkit_default                                                                 14.8s
------
 > [internal] booting buildkit:
#0 16.21 time="2023-10-30T18:50:25Z" level=warning msg="using host network as the default"
#0 16.21 time="2023-10-30T18:50:25Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
#0 16.21 time="2023-10-30T18:50:25Z" level=info msg="found 1 workers, default=\"w5oo16t8h2xbxo8abznff0azn\""
#0 16.21 time="2023-10-30T18:50:25Z" level=warning msg="currently, only the default worker can be used."
#0 16.21 time="2023-10-30T18:50:25Z" level=info msg="running server on /run/buildkit/buildkitd.sock"
#0 16.21 time="2023-10-30T18:50:25Z" level=warning msg="skipping containerd worker, as \"/run/containerd/containerd.sock\" does not exist"
#0 16.21 time="2023-10-30T18:50:25Z" level=warning msg="currently, only the default worker can be used."
#0 16.21 time="2023-10-30T18:50:25Z" level=warning msg="currently, only the default worker can be used."
#0 16.21
------
http: invalid Host header

LIncorruptible avatar Oct 30 '23 18:10 LIncorruptible

In docker-compose-gguf.yml, change the image line to:

image: ghcr.io/abetlen/llama-cpp-python:latest #@sha256:de0fd227f348b5e43d4b5b7300f1344e712c14132914d1332182e9ecfde502b2

i.e. remove (comment out) the @sha256 block. (It works as of 12.09.23 :) )

Doesn't seem to work anymore,

llama-gpt-llama-gpt-api-1  | /usr/local/bin/python3: Cannot use package as __main__ module; 'llama_cpp.server' is a package and cannot be directly executed

LunaSquee avatar Nov 01 '23 19:11 LunaSquee

In docker-compose-gguf.yml, change the image line to:

image: ghcr.io/abetlen/llama-cpp-python:latest #@sha256:de0fd227f348b5e43d4b5b7300f1344e712c14132914d1332182e9ecfde502b2

i.e. remove (comment out) the @sha256 block. (It works as of 12.09.23 :) )

This doesn't do anything different for me. I still get the same message:

2023-11-18 23:40:42 /usr/local/lib/python3.10/dist-packages/pydantic/_internal/_fields.py:128: UserWarning: Field "model_alias" has conflict with protected namespace "model_".
2023-11-18 23:40:42 
2023-11-18 23:40:42 You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ('settings_',)`.
2023-11-18 23:40:42   warnings.warn(

Does anyone know which file has the model_alias variable to change?
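As an aside, that pydantic message is only a warning and is not what kills the container; the crash is the exit code 139 segfault discussed above. The model_alias field itself is declared in llama-cpp-python's server code, while _fields.py is just the pydantic internal module that emits the warning. A hedged way to locate the declaration inside the running API container, where the container name is an assumption based on earlier logs and the dist-packages prefix is the one shown in the warning:

# Search llama-cpp-python's server module inside the container for the model_alias declaration
docker exec -it llama-gpt-llama-gpt-api-1 grep -rn "model_alias" /usr/local/lib/python3.10/dist-packages/llama_cpp/server/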

kawnah avatar Nov 19 '23 04:11 kawnah

_fields.py, located somewhere inside the Docker container on Linux.

Unbelieverman avatar Dec 20 '23 19:12 Unbelieverman

nano docker-compose-gguf.yml

image: ghcr.io/abetlen/llama-cpp-python:latest or image: ghcr.io/abetlen/llama-cpp-python:v0.2.32

./run.sh --model code-7b
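In other words, pin the API image to a tag that is known to work instead of :latest, then restart. A hedged one-liner for the pinning step; the sed pattern assumes the image line matches one of the forms quoted above:

# Pin the API image to v0.2.32 in docker-compose-gguf.yml, then start the code model again
sed -i 's#image: ghcr\.io/abetlen/llama-cpp-python:.*#image: ghcr.io/abetlen/llama-cpp-python:v0.2.32#' docker-compose-gguf.yml
./run.sh --model code-7b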

smi-ed avatar Jan 23 '24 19:01 smi-ed