llama-cpp-python
Python bindings for llama.cpp
Error when using LangChain with the server (as outlined in `examples/notebooks/Clients.ipynb`): setting streaming to true and setting up the handlers as in [their documentation](https://python.langchain.com/en/latest/modules/models/llms/examples/streaming_llm.html), I end up getting the...
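For reference, a minimal sketch of the setup being described (not the notebook's exact code), assuming the llama-cpp-python server is running at `http://localhost:8000/v1` and a LangChain version that accepts `callbacks=[...]`:

```python
# Point LangChain's OpenAI wrapper at the local OpenAI-compatible server and
# stream tokens to stdout as they arrive.
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = OpenAI(
    openai_api_base="http://localhost:8000/v1",  # llama-cpp-python server (assumed address)
    openai_api_key="sk-not-needed",              # dummy key; the local server is assumed not to check it
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)

print(llm("Write a haiku about llamas."))
```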
already satisfied: pydantic==1.10.7 in /usr/local/lib/python3.9/site-packages (1.10.7) ... ERROR: Could not find a version that satisfies the requirement lama-cpp-python[server] (from versions: none) ... ERROR: No matching distribution found for...
The goal of this feature is to reduce latency for repeated calls to the chat_completion API by saving the kv_cache keyed by the prompt tokens. The basic version of this...
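As a rough sketch of the idea (hypothetical names, not the library's actual API), the cache could map a prompt's token sequence to a saved kv_cache state so that a repeated prompt skips re-evaluation:

```python
from typing import Dict, Optional, Sequence, Tuple

class PromptKeyedKVCache:
    """Hypothetical helper: maps a prompt's token ids to a saved kv_cache state."""

    def __init__(self) -> None:
        # Token ids are hashable as a tuple; the value is an opaque saved-state blob.
        self._store: Dict[Tuple[int, ...], bytes] = {}

    def get(self, tokens: Sequence[int]) -> Optional[bytes]:
        return self._store.get(tuple(tokens))

    def put(self, tokens: Sequence[int], state: bytes) -> None:
        self._store[tuple(tokens)] = state
```

Before evaluating a chat_completion prompt, the server would tokenize it, restore any saved state found for those tokens, and only evaluate the uncached suffix.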
Hi people 👋🏾 ! While using [langchain](https://github.com/hwchase17/langchain) and llama-cpp-python I've noticed that I had to initialise two instances of the model (one for the embeddings and another one for the...
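For comparison, a minimal sketch of the single-instance approach being asked about, using llama-cpp-python directly (assuming an instance opened with `embedding=True` can serve both kinds of calls; the model path is a placeholder):

```python
from llama_cpp import Llama

# One model instance shared by both workloads.
llm = Llama(model_path="./models/ggml-model.bin", embedding=True)

# Completion call
completion = llm.create_completion("Q: What is a llama? A:", max_tokens=32)
print(completion["choices"][0]["text"])

# Embedding call on the same instance
embedding = llm.create_embedding("A llama is a domesticated South American camelid.")
print(len(embedding["data"][0]["embedding"]))
```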
I try to run the sample code and get an error. It doesn't depend on the model I'm using. Code:

```python
from llama_cpp import Llama

llm = Llama(model_path="ggml-model.bin")
output = ...
```
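For reference, a runnable version of that snippet in the style of the project README (the prompt and parameters here are illustrative, not necessarily the reporter's exact code):

```python
from llama_cpp import Llama

llm = Llama(model_path="ggml-model.bin")
output = llm(
    "Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    echo=True,
)
print(output["choices"][0]["text"])
```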
Support for Nat Friedman's Openplayground project via the OpenAI API server. You can currently test this with:

```
docker run --rm --name openplayground -e OPENAI_API_BASE= -p 5432:5432 --volume openplayground:/web/config natorg/openplayground...
```
In the file [examples/low_level_api/low_level_api_chat_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/887f3b73ac16976d63c699adcb399ad63054ee74/examples/low_level_api/low_level_api_chat_cpp.py), the wrong type is returned at lines [L316-L317](https://github.com/abetlen/llama-cpp-python/blob/887f3b73ac16976d63c699adcb399ad63054ee74/examples/low_level_api/low_level_api_chat_cpp.py#L316-L317): a `str` is returned where a `llama_token` (aka `c_int`) is expected. This subsequently causes an error at line [L358](https://github.com/abetlen/llama-cpp-python/blob/887f3b73ac16976d63c699adcb399ad63054ee74/examples/low_level_api/low_level_api_chat_cpp.py#L358). Frequency: Sometimes...
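To illustrate the mismatch with a standalone sketch (not the file's actual code): `llama_token` is an integer token id (a `c_int` alias in the bindings), so anything typed as a token must carry the id rather than decoded text.

```python
import ctypes

llama_token = ctypes.c_int  # how the bindings alias the token type (assumed here)

def next_token_buggy() -> llama_token:
    # Returns the decoded text; downstream code that treats the value as an
    # integer token id (comparisons, indexing, ctypes calls) will break.
    return "\n"

def next_token_fixed(token_id: int) -> llama_token:
    # Returns the integer token id, wrapped as c_int where ctypes requires it.
    return llama_token(token_id)
```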
Fixes #70. This PR adds a Dockerfile and updates the release workflow to also build the latest Docker image. Both amd64 and arm64 architectures are built.
I wrote the original issue template for llama.cpp. Happy to help manage the issues a bit here, but can we have something similar, please? --- # Prerequisites Please answer the...