Explicitly cast messages to string for RAG purposes
Hi,
I've been testing the OpenAI-like server with Ollama-webui, and when using the RAG pipeline the message content comes through as a list rather than a string, so I get this error from the llama-cpp-python server:
File "~/miniconda3/envs/local_llm/lib/python3.11/site-packages/llama_cpp/llama_chat_format.py", line 170, in _format_chatml
ret += role + "\n" + message + sep + "\n"
~~~~~~~~~~~~^~~~~~~~~
TypeError: can only concatenate str (not "list") to str
Simply force-casting the message to a string before it is appended in the chat format fixes this issue. If that's not the right way to handle it, please let me know what I can do to solve it.
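To be concrete, this is roughly the kind of preprocessing I mean (just a sketch; the helper name and structure are mine, not from the library) - collapse any content-part lists into plain strings before they reach the chat format:

from typing import Any

def flatten_message_content(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Rough sketch: join OpenAI-style content-part lists ({"type": "text", ...})
    # into a single string so _format_chatml only ever concatenates str values.
    out = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            parts = []
            for part in content:
                if isinstance(part, dict) and part.get("type") == "text":
                    parts.append(part.get("text", ""))
                else:
                    parts.append(str(part))
            content = "\n".join(parts)
        out.append({**msg, "content": content})
    return out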
Thanks
Hey @nivibilla, can you provide a log of the messages being sent that cause this issue? The type hints should be correct there, so any need to cast to str is likely another bug (probably in my code, but I'm curious where it originates).
In my case:
Error occurred when executing LLavaSamplerSimple:
can only concatenate str (not "list") to str
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_VLM_nodes\nodes\llavaloader.py", line 101, in generate_text
response = llm.create_chat_completion(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama.py", line 2109, in create_chat_completion
return handler(
^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama_chat_format.py", line 318, in basic_create_chat_completion
result = f(
^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama_chat_format.py", line 685, in format_chatml
_prompt = _format_chatml(system_message, _messages, _sep)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama_chat_format.py", line 170, in _format_chatml
ret += role + "\n" + message + sep + "\n"
~~~~~~~~~~~~^~~~~~~~~
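As far as I can tell it's the same root cause: _format_chatml does ret += role + "\n" + message + sep + "\n", and message here is a list of content parts instead of a string. A minimal reproduction outside ComfyUI (the model path is a placeholder) would be something like:

from llama_cpp import Llama

llm = Llama(model_path="path/to/any-model.gguf", chat_format="chatml")

# Passing content as a list of parts (the OpenAI vision-style format)
# instead of a plain string triggers the same TypeError in _format_chatml.
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": [{"type": "text", "text": "hello"}]},
    ]
)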
I have the same problem. I started the server with the following command:
python3 -m llama_cpp.server --model /app/vlm_weights/MiniCPM-V-2_6-gguf/ggml-model-Q2_K.gguf --n_gpu_layers -1
Then I ran the example client code:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1/", api_key="llama.cpp")
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png",
                    },
                },
                {
                    "type": "text",
                    "text": "What does the image say. Format your response as a json object with a single 'text' key.",
                },
            ],
        }
    ],
    response_format={
        "type": "json_object",
        "schema": {"type": "object", "properties": {"text": {"type": "string"}}},
    },
)
import json
print(json.loads(response.choices[0].message.content))
In my server terminal I see:
INFO: Started server process [526705]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
Exception: can only concatenate str (not "list") to str
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
response = await original_route_handler(request)
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 301, in app
raw_response = await run_endpoint_function(
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 212, in run_endpoint_function
return await dependant.call(**values)
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/app.py", line 513, in create_chat_completion
] = await run_in_threadpool(llama.create_chat_completion, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/starlette/concurrency.py", line 39, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 859, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1898, in create_chat_completion
return handler(
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_chat_format.py", line 564, in chat_completion_handler
result = chat_formatter(
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_chat_format.py", line 229, in __call__
prompt = self._environment.render(
File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 1304, in render
self.environment.handle_exception()
File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 939, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 4, in top-level template code
TypeError: can only concatenate str (not "list") to str
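I suspect this happens because I started the server without a multimodal chat handler, so the image_url/text content list goes straight into the model's chat template, which only expects string content. Based on the llava examples in the docs, launching with an explicit multimodal chat format and the clip/mmproj model should be the supported path. The mmproj filename below is a placeholder, and I'm not certain of the exact chat_format name for MiniCPM-V in my installed version, so treat this as a sketch:

python3 -m llama_cpp.server \
  --model /app/vlm_weights/MiniCPM-V-2_6-gguf/ggml-model-Q2_K.gguf \
  --clip_model_path /app/vlm_weights/MiniCPM-V-2_6-gguf/mmproj-model-f16.gguf \
  --chat_format minicpm-v-2.6 \
  --n_gpu_layers -1

If no multimodal handler is available for the model, flattening the text parts into a plain string on the client side (as sketched earlier in this thread) avoids the TypeError, but of course the image is then ignored.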