llama-cpp-chat-completion-wrapper
Wrapper around llama-cpp-python for chat completion with LLaMA v2 models.
LLaMA v2 Chat Completion Wrapper
Handles the chat completion message format used with llama-cpp-python. The code is essentially the same as Meta's original code.
NOTE: the output is still not identical to that of the Meta code. More about that here.
Update: I added an option to use the original Meta tokenizer encoder in order to get the correct result. See the example.py file and the USE_META_TOKENIZER_ENCODER flag.
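To illustrate what "handling the chat completion message format" involves, here is a minimal sketch of the LLaMA v2 chat prompt layout ([INST], <<SYS>> tags) that such a wrapper has to produce. The function name and structure are illustrative assumptions, not this wrapper's actual API; check example.py for the real usage.

```python
def format_llama2_prompt(messages):
    """Render a list of {"role", "content"} dicts into a LLaMA v2 prompt string.

    Hypothetical helper for illustration only; the wrapper's real API may differ.
    """
    B_INST, E_INST = "[INST]", "[/INST]"
    B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

    # A leading system message is folded into the first user turn,
    # wrapped in <<SYS>> ... <</SYS>> tags.
    if messages and messages[0]["role"] == "system":
        system = B_SYS + messages[0]["content"] + E_SYS
        merged = {"role": "user", "content": system + messages[1]["content"]}
        messages = [merged] + messages[2:]

    parts = []
    for msg in messages:
        if msg["role"] == "user":
            # Each user turn is enclosed in [INST] ... [/INST].
            parts.append(f"{B_INST} {msg['content']} {E_INST}")
        else:
            # Assistant turns are appended as plain text between instructions.
            parts.append(f" {msg['content']} ")
    return "".join(parts)
```

For example, a system message plus one user message renders as a single [INST] block with the system prompt embedded at the top.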
Installation
Developed using Python 3.10 on Windows.
pip install -r requirements.txt
Usage
Check the example.py file.
Streamlit
First, install Streamlit:
pip install streamlit
Then run the streamlit_app.py file with:
streamlit run streamlit_app.py