llama-cpp-python
Python bindings for llama.cpp
When I try to install and use this package via a requirements file in the default Python 3.10 container, I get the following error when I try to import the...
Working version (draft) for #74. Trying to find a nice cross-platform method of fetching info simply (and including dep versions), ex: ``` > npx envinfo --system --npmPackages --languages --IDEs...
I'd like to propose a future feature that I think would add useful flexibility for users of the `completions/embeddings` API. I'm suggesting the ability to dynamically load models based on...
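One hedged way to sketch the proposal above: keep a registry of model aliases and load them lazily on first use, so a single server can serve several models. Everything here (the alias names, paths, and `get_model` helper) is hypothetical illustration, not the project's actual API; in the real server the loader would construct a `Llama` instance instead of a dict.

```python
from functools import lru_cache

# Hypothetical alias -> model-path registry (illustration only).
MODEL_PATHS = {
    "small": "/models/ggml-small.bin",
    "large": "/models/ggml-large.bin",
}

@lru_cache(maxsize=2)  # keep at most two models resident at once
def get_model(alias: str) -> dict:
    """Lazily load and cache a model by alias.

    In a real implementation this would return
    Llama(model_path=MODEL_PATHS[alias]); a dict stands in here so the
    sketch runs without a model file.
    """
    return {"alias": alias, "path": MODEL_PATHS[alias]}
```

A request handler could then call `get_model(request.model)` and reuse the cached instance, evicting the least-recently-used model when the cache is full.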
EDIT: I'm running this on an M1 MacBook. Using the model directly works as expected, but running it through Python gives me this output. The `.dylib` binary is built from...
0.1.32 is 2x slower than 0.1.27. I tried using `use_mlock=True`; it warned me about RLIMIT and I had to run `ulimit -l unlimited` temporarily, but it still didn't improve. Is anyone else getting...
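The `ulimit -l` warning above can be checked from Python before enabling `use_mlock`. A minimal sketch, assuming a Unix system where the locked-memory limit is exposed via the stdlib `resource` module (`mlock` fails or warns when the model is larger than this soft limit):

```python
import resource

# Read the current locked-memory rlimit (what `ulimit -l` reports).
soft, hard = resource.getrlimit(resource.RLIMIT_MEMLOCK)

if soft != resource.RLIM_INFINITY:
    print(f"RLIMIT_MEMLOCK soft limit is {soft} bytes; "
          "run `ulimit -l unlimited` before loading with use_mlock=True")
else:
    print("RLIMIT_MEMLOCK is unlimited; use_mlock should work")
```

Raising the limit permanently usually means editing `/etc/security/limits.conf` rather than calling `ulimit` per shell.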
Need to get structured information upfront; feature requests etc. can stay free-form.
The chat completion API, specifically in FastAPI, wasn't doing a very consistent job of completing chats. The results seem to consistently generate gibberish (like `\nA\n/imagine prompt: User is asking about...
When the model wants to output an emoji, this error comes up: `Debugging middleware caught exception in streamed response at a point where response headers were already sent. Traceback (most...
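A likely cause of the emoji failure above is that a multi-byte UTF-8 character gets split across two streamed token chunks, and decoding each chunk independently raises mid-stream, after headers are sent. A minimal sketch of the usual fix, assuming the stream delivers raw UTF-8 bytes: use an incremental decoder that buffers incomplete sequences instead of calling `bytes.decode` per chunk.

```python
import codecs

# A 4-byte emoji (U+1F600) split across two stream chunks, as a model
# streaming one token at a time might produce it.
chunks = [b"\xf0\x9f", b"\x98\x80"]

# The incremental decoder buffers trailing partial sequences between calls,
# so a split code point never raises UnicodeDecodeError mid-stream.
decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")

text = ""
for chunk in chunks:
    text += decoder.decode(chunk)   # yields "" for the incomplete first chunk
text += decoder.decode(b"", final=True)  # flush any buffered bytes

print(text)  # → "😀"
```

By contrast, `chunks[0].decode("utf-8")` would raise immediately, which matches the traceback seen once response headers were already sent.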
I ran `pip install llama-cpp-python` and the installation was a success. Then I created a Python file and copied over the example text in the README. The only change I...