Explicitly cast messages to string for RAG purposes
Hi,
I've been testing the OpenAI-like server with Ollama-webui, and when using the RAG pipeline the message content comes through as a list rather than a string, so I get this error from the llama-cpp-python server:
File "~/miniconda3/envs/local_llm/lib/python3.11/site-packages/llama_cpp/llama_chat_format.py", line 170, in _format_chatml
ret += role + "\n" + message + sep + "\n"
~~~~~~~~~~~~^~~~~~~~~
TypeError: can only concatenate str (not "list") to str
Simply force-casting the message to a string before it is appended in the chat format fixes this issue. If that's not the right way to handle it, please let me know what I can do to solve it.
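To be concrete, this is roughly the kind of preprocessing I mean (just a sketch; the helper name and structure are mine, not from the library) - collapse any content-part lists into plain strings before they reach the chat format:

from typing import Any

def flatten_message_content(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    # Rough sketch: join OpenAI-style content-part lists ({"type": "text", ...})
    # into a single string so _format_chatml only ever concatenates str values.
    out = []
    for msg in messages:
        content = msg.get("content")
        if isinstance(content, list):
            parts = []
            for part in content:
                if isinstance(part, dict) and part.get("type") == "text":
                    parts.append(part.get("text", ""))
                else:
                    parts.append(str(part))
            content = "\n".join(parts)
        out.append({**msg, "content": content})
    return out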
Thanks
Hey @nivibilla, can you provide a log of the messages being sent that cause this issue? The type hints should be correct there, so any need to cast to str is likely another bug (probably in my code, but I'm curious where it originates).
In my case:
Error occurred when executing LLavaSamplerSimple:
can only concatenate str (not "list") to str
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\execution.py", line 152, in recursive_execute
output_data, output_ui = get_output_data(obj, input_data_all)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\execution.py", line 82, in get_output_data
return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\execution.py", line 75, in map_node_over_list
results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\ComfyUI\custom_nodes\ComfyUI_VLM_nodes\nodes\llavaloader.py", line 101, in generate_text
response = llm.create_chat_completion(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama.py", line 2109, in create_chat_completion
return handler(
^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama_chat_format.py", line 318, in basic_create_chat_completion
result = f(
^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama_chat_format.py", line 685, in format_chatml
_prompt = _format_chatml(system_message, _messages, _sep)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\dev\ai\ComfyUI_windows_portable\python_embeded\Lib\site-packages\llama_cpp\llama_chat_format.py", line 170, in _format_chatml
ret += role + "\n" + message + sep + "\n"
~~~~~~~~~~~~^~~~~~~~~
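As far as I can tell it's the same root cause: _format_chatml does ret += role + "\n" + message + sep + "\n", and message here is a list of content parts instead of a string. A minimal reproduction outside ComfyUI (the model path is a placeholder) would be something like:

from llama_cpp import Llama

llm = Llama(model_path="path/to/any-model.gguf", chat_format="chatml")

# Passing content as a list of parts (the OpenAI vision-style format)
# instead of a plain string triggers the same TypeError in _format_chatml.
llm.create_chat_completion(
    messages=[
        {"role": "user", "content": [{"type": "text", "text": "hello"}]},
    ]
)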
I have the same problem. I started the server with the following command:
python3 -m llama_cpp.server --model /app/vlm_weights/MiniCPM-V-2_6-gguf/ggml-model-Q2_K.gguf --n_gpu_layers -1
Then I ran the example client code:
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1/", api_key="llama.cpp")
response = client.chat.completions.create(
    model="gpt-4-vision-preview",
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://user-images.githubusercontent.com/1991296/230134379-7181e485-c521-4d23-a0d6-f7b3b61ba524.png",
                    },
                },
                {
                    "type": "text",
                    "text": "What does the image say. Format your response as a json object with a single 'text' key.",
                },
            ],
        }
    ],
    response_format={
        "type": "json_object",
        "schema": {"type": "object", "properties": {"text": {"type": "string"}}},
    },
)
import json
print(json.loads(response.choices[0].message.content))
In my server terminal I see:
INFO: Started server process [526705]
INFO: Waiting for application startup.
INFO: Application startup complete.
INFO: Uvicorn running on http://localhost:8000 (Press CTRL+C to quit)
Exception: can only concatenate str (not "list") to str
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/errors.py", line 171, in custom_route_handler
response = await original_route_handler(request)
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 301, in app
raw_response = await run_endpoint_function(
File "/usr/local/lib/python3.10/dist-packages/fastapi/routing.py", line 212, in run_endpoint_function
return await dependant.call(**values)
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/server/app.py", line 513, in create_chat_completion
] = await run_in_threadpool(llama.create_chat_completion, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/starlette/concurrency.py", line 39, in run_in_threadpool
return await anyio.to_thread.run_sync(func, *args)
File "/usr/local/lib/python3.10/dist-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/dist-packages/anyio/_backends/_asyncio.py", line 859, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama.py", line 1898, in create_chat_completion
return handler(
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_chat_format.py", line 564, in chat_completion_handler
result = chat_formatter(
File "/usr/local/lib/python3.10/dist-packages/llama_cpp/llama_chat_format.py", line 229, in __call__
prompt = self._environment.render(
File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 1304, in render
self.environment.handle_exception()
File "/usr/local/lib/python3.10/dist-packages/jinja2/environment.py", line 939, in handle_exception
raise rewrite_traceback_stack(source=source)
File "<template>", line 4, in top-level template code
TypeError: can only concatenate str (not "list") to str
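I suspect this happens because I started the server without a multimodal chat handler, so the image_url/text content list goes straight into the model's chat template, which only expects string content. Based on the llava examples in the docs, launching with an explicit multimodal chat format and the clip/mmproj model should be the supported path. The mmproj filename below is a placeholder, and I'm not certain of the exact chat_format name for MiniCPM-V in my installed version, so treat this as a sketch:

python3 -m llama_cpp.server \
  --model /app/vlm_weights/MiniCPM-V-2_6-gguf/ggml-model-Q2_K.gguf \
  --clip_model_path /app/vlm_weights/MiniCPM-V-2_6-gguf/mmproj-model-f16.gguf \
  --chat_format minicpm-v-2.6 \
  --n_gpu_layers -1

If no multimodal handler is available for the model, flattening the text parts into a plain string on the client side (as sketched earlier in this thread) avoids the TypeError, but of course the image is then ignored.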