llama-cpp-python
Python bindings for llama.cpp
- [x] #489
- [x] #490
- [x] #491
- [x] #488
- [x] #492
- [x] #494
- [x] #493
- [ ] #771
- [x] #675
This PR updates llama_cpp.py so that it matches the [llama.h](https://github.com/ggml-org/llama.cpp/blob/master/include/llama.h) API changes introduced in the following commits:

- https://github.com/ggml-org/llama.cpp/commit/e0dbec0bc6cd4b6230cda7a6ed1e9dac08d1600b
- https://github.com/ggml-org/llama.cpp/commit/8fcb563613e20a04dd9791f0a9b8a41086428c09
- https://github.com/ggml-org/llama.cpp/commit/00d53800e00bb22a26bf710fa6bd1150e412cc1d
- https://github.com/ggml-org/llama.cpp/commit/dd373dd3bf81eced3e711fb7cb49123a6105933e
- https://github.com/ggml-org/llama.cpp/commit/b3de7cac732e7dc0e10e1bb07502a500b0ee9022
- https://github.com/ggml-org/llama.cpp/commit/2c3f8b850a4a6cff0f5dda2135c03fc81d33ed8b
- https://github.com/ggml-org/llama.cpp/commit/e0e912f49b3195ef9d0c51378629ba03c9b972da

I couldn't find any example on how...
Hi! 👋 I was testing the Server API with Moondream 2 as the model, and the [documented `chat_format`](https://github.com/abetlen/llama-cpp-python/blob/37eb5f0a4c2a8706b89ead1406b1577c4602cdec/README.md#L504), `moondream2`, threw an error: ```plain Error code: 500 - {'error': {'message': "Invalid...
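The error above comes from the server's OpenAI-compatible `/v1/chat/completions` endpoint. As a point of reference, a minimal sketch of the multimodal chat payload that kind of request sends (the image URL, prompt text, and the `build_chat_request` helper are illustrative, not taken from the report):

```python
import json

def build_chat_request(model, image_url, prompt):
    """Build an OpenAI-style multimodal chat payload (illustrative helper)."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    # Image part first, then the text prompt, per the
                    # OpenAI multimodal message format.
                    {"type": "image_url", "image_url": {"url": image_url}},
                    {"type": "text", "text": prompt},
                ],
            }
        ],
    }

payload = build_chat_request(
    "moondream2",
    "https://example.com/cat.png",  # placeholder URL
    "Describe this image.",
)
body = json.dumps(payload)  # what the client POSTs to /v1/chat/completions
```

With a server started with `--chat_format moondream2`, the chat handler is what turns a payload like this into a prompt, which is where the 500 above is raised.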
I updated my version because some DeepSeek models were failing to load; after updating, they load, but only on the CPU. I tried other, older models on my system...
In the method `load_shared_library`, the `winmode` specification is incorrect. The current code is:

```python
cdll_args["winmode"] = ctypes.RTLD_GLOBAL
```

However, `ctypes.RTLD_GLOBAL` is not a valid value for the `winmode` argument; it is only a value for the `mode` argument....
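A sketch of how the loader could pass `RTLD_GLOBAL` correctly (the function name mirrors the one in the report, but the body is illustrative, not the library's actual code):

```python
import ctypes
import sys

def load_shared_library(lib_path):
    """Load a shared library, passing RTLD_GLOBAL where it belongs.

    On POSIX, RTLD_GLOBAL is a flag for the `mode` argument of
    ctypes.CDLL. The separate `winmode` argument controls Windows
    LoadLibraryEx flags, and RTLD_GLOBAL is not meaningful there
    (on Windows, ctypes defines RTLD_GLOBAL as 0, so passing it as
    `winmode` silently selects the legacy DLL search behaviour).
    """
    cdll_args = {}
    if sys.platform != "win32":
        # RTLD_GLOBAL belongs in `mode`, not `winmode`.
        cdll_args["mode"] = ctypes.RTLD_GLOBAL
    # On Windows, leave `winmode` unset so ctypes uses its default
    # (secure) DLL search path.
    return ctypes.CDLL(str(lib_path), **cdll_args)
```

Leaving `winmode` unset at all is usually the safest choice, since Python's default enables the hardened DLL search path introduced in Python 3.8.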
# Prerequisites Please answer the following questions for yourself before submitting an issue. - [ ] I am running the latest code. Development is very rapid, so there are no...
I was curious whether it is possible to swap out Microsoft's code-completion engine for an open-source one. After following the [docs](https://llama-cpp-python.readthedocs.io/en/latest/server/#guides), it appears that Copilot has changed...
I'm using CentOS 7 (glibc 2.17) with both CUDA 11.8 and 12.4. Up until version 0.3.7, I was able to install llama-cpp-python with either CUDA version. However, starting from 0.3.8,...
Hey everyone, and especially @abetlen , Hope you’re doing well! I’m reaching out to see if you could help me add support for `Gemma3 multimodal`. If you could walk me...
Support for the Qwen2-VL and MiniCPM-o models would be nice. They have already been merged into the llava subproject of llama.cpp.