llama-cpp-python
Python bindings for llama.cpp
Error when using LangChain with the server (as outlined in `examples/notebooks/Clients.ipynb`): setting streaming to true and setting up the handlers as in [their documentation](https://python.langchain.com/en/latest/modules/models/llms/examples/streaming_llm.html), I end up getting the...
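For reference, a minimal sketch of the setup being described (not the notebook's exact code), assuming the llama-cpp-python server is running at `http://localhost:8000/v1` and a LangChain version that accepts `callbacks=[...]`:

```python
# Point LangChain's OpenAI wrapper at the local OpenAI-compatible server and
# stream tokens to stdout as they arrive.
from langchain.llms import OpenAI
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

llm = OpenAI(
    openai_api_base="http://localhost:8000/v1",  # llama-cpp-python server (assumed address)
    openai_api_key="sk-not-needed",              # dummy key; the local server is assumed not to check it
    streaming=True,
    callbacks=[StreamingStdOutCallbackHandler()],
)

print(llm("Write a haiku about llamas."))
```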
already satisfied: pydantic==1.10.7 in /usr/local/lib/python3.9/site-packages (1.10.7) ... ERROR: Could not find a version that satisfies the requirement lama-cpp-python[server] (from versions: none) ... ERROR: No matching distribution found for...
The goal of this feature is to reduce latency for repeated calls to the chat_completion API by saving the kv_cache keyed by the prompt tokens. The basic version of this...
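As a rough sketch of the idea (hypothetical names, not the library's actual API), the cache could map a prompt's token sequence to a saved kv_cache state so that a repeated prompt skips re-evaluation:

```python
from typing import Dict, Optional, Sequence, Tuple

class PromptKeyedKVCache:
    """Hypothetical helper: maps a prompt's token ids to a saved kv_cache state."""

    def __init__(self) -> None:
        # Token ids are hashable as a tuple; the value is an opaque saved-state blob.
        self._store: Dict[Tuple[int, ...], bytes] = {}

    def get(self, tokens: Sequence[int]) -> Optional[bytes]:
        return self._store.get(tuple(tokens))

    def put(self, tokens: Sequence[int], state: bytes) -> None:
        self._store[tuple(tokens)] = state
```

Before evaluating a chat_completion prompt, the server would tokenize it, restore any saved state found for those tokens, and only evaluate the uncached suffix.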
Hi people 👋🏾 ! While using [langchain](https://github.com/hwchase17/langchain) and llama-cpp-python I've noticed that I had to initialise two instances of the model (one for the embeddings and another one for the...
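For comparison, a minimal sketch of the single-instance approach being asked about, using llama-cpp-python directly (assuming an instance opened with `embedding=True` can serve both kinds of calls; the model path is a placeholder):

```python
from llama_cpp import Llama

# One model instance shared by both workloads.
llm = Llama(model_path="./models/ggml-model.bin", embedding=True)

# Completion call
completion = llm.create_completion("Q: What is a llama? A:", max_tokens=32)
print(completion["choices"][0]["text"])

# Embedding call on the same instance
embedding = llm.create_embedding("A llama is a domesticated South American camelid.")
print(len(embedding["data"][0]["embedding"]))
```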
I try to run the sample code and get an error. It doesn't depend on the model I'm using. Code:

```python
from llama_cpp import Llama

llm = Llama(model_path="ggml-model.bin")
output = ...
```
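For reference, a runnable version of that snippet in the style of the project README (the prompt and parameters here are illustrative, not necessarily the reporter's exact code):

```python
from llama_cpp import Llama

llm = Llama(model_path="ggml-model.bin")
output = llm(
    "Q: Name the planets in the solar system? A: ",
    max_tokens=32,
    stop=["Q:", "\n"],
    echo=True,
)
print(output["choices"][0]["text"])
```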
Support for Nat Friedman's Openplayground project via the OpenAI API server. You can currently test this with:

```
docker run --rm --name openplayground -e OPENAI_API_BASE= -p 5432:5432 --volume openplayground:/web/config natorg/openplayground...
```
In the file [examples/low_level_api/low_level_api_chat_cpp.py](https://github.com/abetlen/llama-cpp-python/blob/887f3b73ac16976d63c699adcb399ad63054ee74/examples/low_level_api/low_level_api_chat_cpp.py), the wrong type is returned at lines [L316-L317](https://github.com/abetlen/llama-cpp-python/blob/887f3b73ac16976d63c699adcb399ad63054ee74/examples/low_level_api/low_level_api_chat_cpp.py#L316-L317): a `str` is returned where a `llama_token` (aka `c_int`) is expected. This subsequently causes an error at line [L358](https://github.com/abetlen/llama-cpp-python/blob/887f3b73ac16976d63c699adcb399ad63054ee74/examples/low_level_api/low_level_api_chat_cpp.py#L358). Frequency: Sometimes...
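To illustrate the mismatch with a standalone sketch (not the file's actual code): `llama_token` is an integer token id (a `c_int` alias in the bindings), so anything typed as a token must carry the id rather than decoded text.

```python
import ctypes

llama_token = ctypes.c_int  # how the bindings alias the token type (assumed here)

def next_token_buggy() -> llama_token:
    # Returns the decoded text; downstream code that treats the value as an
    # integer token id (comparisons, indexing, ctypes calls) will break.
    return "\n"

def next_token_fixed(token_id: int) -> llama_token:
    # Returns the integer token id, wrapped as c_int where ctypes requires it.
    return llama_token(token_id)
```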
Fixes #70. This PR adds a Dockerfile and updates the release workflow to also build the latest Docker image. Both amd64 and arm64 architectures are built.
I wrote the original issue template for llama.cpp. Happy to help manage the issues a bit here, but can we have something similar, please? --- # Prerequisites Please answer the...