[BUG] AIMessage missing `content` error and CitationPipeline validation error when using a local LLM (ollama)
Description
I am encountering two main issues while using the RAG (Retrieval-Augmented Generation) system with a local LLM (ollama) for document retrieval and question answering. The system is set up to retrieve documents from a vector database, and I am seeing the following errors:
`TypeError: AIMessage.__init__() missing 1 required positional argument: 'content'`

The error occurs while generating and appending the AIMessage response: the content does not appear to be passed through correctly when the AIMessage object is constructed. Example log:

```
Traceback (most recent call last):
  ...
    messages.append(AIMessage(content=ai))
  File "/Users/zooyong/Documents/Kotaemon/libs/kotaemon/kotaemon/base/schema.py", line 63, in __init__
    super().__init__(*args, **kwargs)
TypeError: AIMessage.__init__() missing 1 required positional argument: 'content'
```

ValidationError in CitationPipeline:
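As a minimal defensive workaround sketch (not kotaemon's actual code), coercing the LLM output to a string before constructing the message avoids this failure when the local model returns `None`. The `AIMessage` dataclass and `safe_ai_message` helper below are hypothetical stand-ins for illustration, not `kotaemon.base.schema.AIMessage`:

```python
from dataclasses import dataclass


@dataclass
class AIMessage:
    # Hypothetical stand-in for kotaemon's AIMessage; it only models
    # the required 'content' field that the TypeError complains about.
    content: str


def safe_ai_message(raw) -> AIMessage:
    # Fall back to an empty string when the local LLM returns None,
    # which would otherwise surface downstream as a missing-'content' error.
    return AIMessage(content="" if raw is None else str(raw))
```

With this guard, `messages.append(safe_ai_message(ai))` would succeed even when `ai` is `None`.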
The second issue concerns citation generation: the pipeline fails to parse the evidence list because the model returns it as a JSON-encoded string rather than an actual list. Example log:

```
CitationPipeline: {"evidences":"[\"Greenville Park is a 1 acre park\", \"The park is located in South Carolina\"]"}
1 validation error for CiteEvidence
evidences
  Input should be a valid list [type=list_type, input_value='["Greenville Park is a 1...ted in South Carolina"]', input_type=str]
```
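A possible fix, sketched below under the assumption that `CiteEvidence` is a pydantic v2 model (the error URL references pydantic 2.9): a `mode="before"` validator can decode the double-encoded JSON string into a list before the `list_type` check runs. The `CiteEvidence` model here is a minimal stand-in, not the pipeline's actual class:

```python
import json

from pydantic import BaseModel, field_validator


class CiteEvidence(BaseModel):
    evidences: list[str]

    @field_validator("evidences", mode="before")
    @classmethod
    def _decode_json_string(cls, v):
        # Local LLMs often emit the list as a JSON-encoded string
        # ('["a", "b"]') instead of a real list; decode it first so
        # the list_type validation passes.
        if isinstance(v, str):
            return json.loads(v)
        return v


# The payload shape from the log now validates instead of raising list_type:
ev = CiteEvidence(
    evidences='["Greenville Park is a 1 acre park", "The park is located in South Carolina"]'
)
```

A real fix would also need to handle malformed JSON from the model (e.g. wrap `json.loads` in a try/except), but this shows the shape of the problem.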
Reproduction steps
1. Set up a RAG system with a local LLM (ollama).
2. Run a document retrieval query.
3. Observe the error about the missing `content` in AIMessage.
4. Observe the validation error during the citation step.
Screenshots
No response
Logs
User-id: 1, can see public conversations: True
Session reasoning type: simple
Session LLM: ollama
Reasoning class: <class 'ktem.reasoning.simple.FullQAPipeline'>
Reasoning state: {'app': {'regen': False}, 'pipeline': {}}
Reasoning ...
Retrievers [DocumentRetrievalPipeline(DS=<kotaemon.storages.docstores.lancedb.LanceDBDocumentStore object at 0x178bb82b0>, FSPath=PosixPath('/Users/zooyong/Documents/Kotaemon/ktem_app_data/user_data/files/index_1'), Index=<class 'ktem.index.file.index.IndexTable'>, Source=<class 'ktem.index.file.index.Source'>, VS=<kotaemon.storages.vectorstores.chroma.ChromaVectorStore object at 0x178bb8100>, get_extra_table=False, llm_scorer=LLMTrulensScoring(concurrent=True, normalize=10, prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x17fd41ae0>, system_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x17fd41840>, top_k=3, user_prompt_template=<kotaemon.llms.prompts.template.PromptTemplate object at 0x17fd428c0>), mmr=False, rerankers=[CohereReranking(cohere_api_key='', model_name='rerank-multilingual-v2.0', use_key_from_ktem=True)], retrieval_mode='vector', top_k=10, user_id=1), GraphRAGRetrieverPipeline(DS=<theflow.base.unset_ object at 0x1016aa0e0>, FSPath=<theflow.base.unset_ object at 0x1016aa0e0>, Index=<class 'ktem.index.file.index.IndexTable'>, Source=<theflow.base.unset_ object at 0x1016aa0e0>, VS=<theflow.base.unset_ object at 0x1016aa0e0>, file_ids=[], user_id=<theflow.base.unset_ object at 0x1016aa0e0>)]
searching in doc_ids ['019bb0c8-5700-44c8-841a-eaedf152c3fa', '0e21b98c-2e34-4846-9b46-bbcd83c8fdec', '11459e65-109a-4700-bd2e-dbb554c2a3d3', '17ae6d7f-e96a-4960-ad55-5d463fc3be0f', '1cd21c75-ea42-4502-a4f9-4700ce84859c', '4ee8b0b2-35d7-4f76-a593-3741a22fb4ab', '65be4a78-49be-4dda-a024-a54bbf1a6cc6', '77c5d4f3-9774-47c0-80e6-0d26e563bbef', '7a2c1a1c-d25a-4cce-98cd-a71a23cab5c0', '7b3f2537-0e6e-4635-a63d-d5272a154572', '7eba5fbf-d7b1-4f1b-a0df-10d5bb3a56dc', '84e8bbb0-5b54-4160-a1fa-cefe1c076a9d', '893987e5-422e-45b9-9384-3343c21ba872', '99cb8d4f-e0e8-4c62-8d0b-50c5c29ad2c6', 'a22b747d-a076-4afe-8150-6f87b7cd64f1', 'a505646e-f0a7-44ef-acd0-d4a2f942aa1c', 'a6b06055-f3d5-4bfc-893a-9f2c8062470d', 'b250bc33-c03b-4cc6-8d08-2adba65c7a20', 'b50b5ace-140d-4213-b9ab-b4e177a3f46c', 'b77704b5-02a7-48e7-8d74-86fe335c3b97', 'c0c4a52a-b893-4a7c-9c12-067395036d3c', 'c5e6ea07-aaea-4043-ae81-e836e50f1500', 'd2544aec-f8c7-4e37-beea-194f124e329b', 'e2f0f5ac-209b-49fd-ac2b-11ded87f3ca2', 'e9ddaab0-ec0a-48d3-ad81-1c2afe829211', 'ee71f36f-f051-4d50-b302-84f3e89cab0d']
retrieval_kwargs: dict_keys(['do_extend', 'scope', 'filters'])
Cannot get Cohere API key from `ktem` 'NoneType' object has no attribute '_kwargs'
Cohere API key not found. Skipping reranking.
Got raw 10 retrieved documents
thumbnail docs 0 non-thumbnail docs 10 raw-thumbnail docs 0
retrieval step took 0.3695368766784668
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Document is not pdf
Got 10 retrieved documents
len (original) 32926
len (trimmed) 32926
Got 0 images
CitationPipeline: invoking LLM
Traceback (most recent call last):
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/queueing.py", line 575, in process_events
response = await route_utils.call_process_api(
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/route_utils.py", line 276, in call_process_api
output = await app.get_blocks().process_api(
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/blocks.py", line 1923, in process_api
result = await self.call_function(
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/blocks.py", line 1520, in call_function
prediction = await utils.async_iteration(iterator)
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/utils.py", line 663, in async_iteration
return await iterator.__anext__()
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/utils.py", line 656, in __anext__
return await anyio.to_thread.run_sync(
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
return await get_async_backend().run_sync_in_worker_thread(
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2405, in run_sync_in_worker_thread
return await future
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 914, in run
result = context.run(func, *args)
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/utils.py", line 639, in run_sync_iterator_async
return next(iterator)
File "/opt/miniconda3/envs/kotaemon/lib/python3.10/site-packages/gradio/utils.py", line 801, in gen_wrapper
response = next(iterator)
File "/Users/zooyong/Documents/Kotaemon/libs/ktem/ktem/pages/chat/__init__.py", line 871, in chat_fn
for response in pipeline.stream(chat_input, conversation_id, chat_history):
File "/Users/zooyong/Documents/Kotaemon/libs/ktem/ktem/reasoning/simple.py", line 673, in stream
answer = yield from self.answering_pipeline.stream(
File "/Users/zooyong/Documents/Kotaemon/libs/ktem/ktem/reasoning/simple.py", line 349, in stream
messages.append(AIMessage(content=ai))
File "/Users/zooyong/Documents/Kotaemon/libs/kotaemon/kotaemon/base/schema.py", line 63, in __init__
super().__init__(*args, **kwargs)
TypeError: AIMessage.__init__() missing 1 required positional argument: 'content'
CitationPipeline: finish invoking LLM
CitationPipeline: {"evidences":"[\"Greenville Park is a 1 acre park\", \"The park is located in South Carolina\"]"}
1 validation error for CiteEvidence
evidences
Input should be a valid list [type=list_type, input_value='["Greenville Park is a 1...ted in South Carolina"]', input_type=str]
For further information visit https://errors.pydantic.dev/2.9/v/list_type
LLM rerank scores [0.7, 0.3, 0.2, 0.2, 0.2, 0.2, 0.1, 0.1, 0.0, 0.0]
Browsers
Chrome
OS
MacOS
Additional information
No response