llama.cpp low-level Python bindings
Background/rationale:
This pull request addresses #82 and #1156, bringing the low-level Python ctypes bindings into llama.cpp. This should help reduce Python binding fragmentation and broaden llama.cpp development. Using Python for examples and main wrappers is a pattern followed in related projects such as rwkv.cpp and bert.cpp.
The ctypes Python binding commits are from @abetlen / llama-python-cpp. Only the commits relevant to the low-level bindings are included; others, such as the high-level module and the server module, are excluded. The remaining commits have been cleaned up somewhat for clarity.
The Python bindings provide functionality equivalent to the bash scripts and main.cpp, though the primary purpose is to improve alignment and widen the development community, since Python is a very common language in this field.
Supporting low-level Python bindings should not put any significant burden on C++ developers. As the Python bindings become widely used, there will be many people interested in keeping them up to date.
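The bindings wrap the C API from llama.h through Python's standard ctypes module. As a minimal sketch of that pattern (using libc rather than the llama library so the snippet is self-contained and runs anywhere; the actual loader in the bindings may differ):

```python
import ctypes
import ctypes.util

# The low-level bindings use this same ctypes pattern, but against the
# shared library produced by building with BUILD_SHARED_LIBS=ON
# (e.g. libllama.so). libc is substituted here purely for illustration.
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declaring argument and return types makes calls across the C boundary
# type-safe, just as the bindings declare them for each llama.h function.
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

print(libc.strlen(b"llama"))  # → 5
```

Because the binding layer is pure ctypes declarations, it tracks llama.h directly and needs no compiled extension module of its own.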
Use:
cmake -D BUILD_SHARED_LIBS=ON .
Chat.py is roughly equivalent to chat-13B.sh:
MODEL=./models/llama-7B/ggml-model.bin python3 examples/Chat.py
low_level_api_chat_cpp.py is similar in functionality to main.cpp:
python3 examples/low_level_api_chat_cpp.py --model ./models/llama-7B/ggml-model.bin -b 1024 -i -r "User:" -f prompts/chat-with-bob.txt
low_level_api_chat_llama.py is a simplified chat example.
Tabs have been replaced and trailing spaces removed in all commits (force-pushed) to pass the editor check.
> Having supported low level python bindings should not put any significant burden on c++ developers. As the python bindings become widely used, there will be many interested in keeping them up to date.
Conversely, that will also mean a lot of people will be angry if you do something that breaks the Python bindings, though.
Not sure about this - I see the positives, but I'm worried that it will be too difficult for me to maintain Python code. Maybe at some later stage we can provide this API, but at the moment it will be a big burden. Open to suggestions though.
Also, I get the impression that the llama-cpp-python project is in pretty good shape and well maintained. I guess people can use that? Is there anything we can do to support it from the llama.cpp side?