llm-graph-builder
Using Ollama and deepseek-r1:1.5b, can't generate a graph
The error occurs in the langchain_experimental/graph_transformers/llm.py aprocess_response function.
How can I fix it?
-----
{'head': '作者', 'head_type': '相关人物', 'relation': {'涉及人物': {'描述': '与相关的人物角色有关'}}, 'tail': '研究成果', 'tail_type': '成果'}
-----
2025-02-24 17:26:41,801 - Deleted File Path: /Users/tiutiutiu/Desktop/llm-graph-builder/backend/merged_files/《教育心理学》.pdf and Deleted File Name : 《教育心理学》.pdf
[ERROR]{'api_name': 'extract', 'message': 'Failed To Process File:《教育心理学》.pdf or LLM Unable To Parse Content ', 'file_created_at': '2025-02-24 16:21:15 ', 'error_message': "1 validation error for Relationship\ntype\n Input should be a valid string [type=string_type, input_value={'涉及人物': {'描述. input_type=dict]\n For further information visit https://errors.pydantic.dev/2.9/v/string_type", 'file_name': '《教育心理学》.pdf', 'status': 'Failed', 'db_url': 'neo4j://192.168.6.6:7688', 'userName': 'neo4j', 'database': 'neo4j', 'failed_count': 1, 'source_type': 'local file', 'source_url': None, 'wiy': None, 'logging_time': '2025-02-24 09:26:41 UTC', 'email': None}
2025-02-24 17:26:41,809 - File Failed in extraction: 1 validation error for Relationship
type
Input should be a valid string [type=string_type, input_value={'涉及人物': {'描述...的人物角色有关'}}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/string_type
Traceback (most recent call last):
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/score.py", line 222, in extract_knowledge_graph_from_file
uri_latency, result = await extract_graph_from_file_local_file(uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, retry_condition, additional_instructions)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/main.py", line 238, in extract_graph_from_file_local_file
return await processing_source(uri, userName, password, database, model, fileName, [], allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, True, merged_file_path, retry_condition, additional_instructions=additional_instructions)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/main.py", line 383, in processing_source
node_count,rel_count,latency_processed_chunk = await processing_chunks(selected_chunks,graph,uri, userName, password, database,file_name,model,allowedNodes,allowedRelationship,chunks_to_combine,node_count, rel_count, additional_instructions)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/main.py", line 478, in processing_chunks
graph_documents = await get_graph_from_llm(model, chunkId_chunkDoc_list, allowedNodes, allowedRelationship, chunks_to_combine, additional_instructions)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/llm.py", line 225, in get_graph_from_llm
graph_document_list = await get_graph_document_list(
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/llm.py", line 208, in get_graph_document_list
graph_document_list = await llm_transformer.aconvert_to_graph_documents(combined_chunk_document_list)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 1034, in aconvert_to_graph_documents
results = await asyncio.gather(*tasks)
File "/Users/tiutiutiu/.pyenv/versions/3.10.16/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
future.result()
File "/Users/tiutiutiu/.pyenv/versions/3.10.16/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 977, in aprocess_response
Relationship(
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/langchain_core/load/serializable.py", line 125, in __init__
super().__init__(*args, **kwargs)
File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Relationship
type
Input should be a valid string [type=string_type, input_value={'涉及人物': {'描述...的人物角色有关'}}, input_type=dict]
For further information visit https://errors.pydantic.dev/2.9/v/string_type
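To make the failure concrete, here is a minimal standalone sketch of why the validation fails. This uses a simplified stand-in model, not the library's actual Relationship class: deepseek-r1:1.5b returned a nested dict where the transformer expects a plain string for the relation type, and Pydantic rejects it.

```python
# Simplified stand-in for the real Relationship model, only to show the
# failure mode; other fields are omitted for brevity.
from pydantic import BaseModel, ValidationError

class Relationship(BaseModel):
    type: str  # the relation type must be a plain string

try:
    # Shape of what the model actually produced: a nested dict.
    Relationship(type={"涉及人物": {"描述": "与相关的人物角色有关"}})
except ValidationError as e:
    print(e)  # 1 validation error for Relationship ... string_type

# A string value validates fine:
Relationship(type="涉及人物")
```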
Similar issue using locally hosted Ollama.
[ERROR]{'api_name': 'extract', 'message': 'Failed To Process File:A_Compact_GuideTo_RAG.pdf or LLM Unable To Parse Content ', 'file_created_at': '2025-02-28 09:18:43 ', 'error_message': "Ollama call failed with status code 400. Details: <bound method ClientResponse.text of <ClientResponse(http://localhost:11434/api/chat) [400 Bad Request]>\n<CIMultiDictProxy('Content-Type': 'application/json; charset=utf-8', 'Date': 'Fri, 28 Feb 2025 17:28:28 GMT', 'Content-Length': '29')>\n>", 'file_name': 'A_Compact_GuideTo_RAG.pdf', 'status': 'Failed', 'db_url': 'neo4j://localhost:7687', 'userName': 'neo4j', 'database': 'neo4j', 'failed_count': 1, 'source_typ
INFO: 127.0.0.1:51052 - "GET /health HTTP/1.1" 200 OK
The graph is created with Chunk nodes, but after that it is not able to feed them back to the LLM for further breakdown.
Currently, the DeepSeek LLM model is not producing structured output as expected when used with LLM Graph Builder. We are actively investigating this issue and working on a resolution. We appreciate your patience and will provide updates as we make progress.
@kaustubh-darekar Has the problem already been solved? Looking forward to your reply.
Hi @jinshuaiwang, we haven't found a solution yet, but you can try https://python.langchain.com/docs/integrations/chat/deepseek/ (if you have a DeepSeek API key). As mentioned there, you can keep model = "deepseek-chat" for structured output.
Let us know if it works.
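As a hedged illustration of that suggestion (a sketch, not the project's exact wiring; it assumes the langchain-deepseek package is installed and DEEPSEEK_API_KEY is set in the environment), the hosted deepseek-chat model can be plugged into LLMGraphTransformer like this:

```python
# Sketch: use the hosted deepseek-chat model (which supports structured
# output) with LLMGraphTransformer, instead of a local deepseek-r1 model.
from langchain_deepseek import ChatDeepSeek
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document

llm = ChatDeepSeek(model="deepseek-chat", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

docs = [Document(page_content="Marie Curie won two Nobel Prizes.")]
graph_documents = transformer.convert_to_graph_documents(docs)
print(graph_documents[0].nodes)
print(graph_documents[0].relationships)
```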
@kaustubh-darekar I run Ollama locally and use ChatOllama and OllamaEmbeddings. When running docs = doc_retriever.invoke({"messages": messages}, {"callbacks": [handler]}), I get: Error retrieving documents: {code: Neo.ClientError.Statement.ParameterMissing} {message: Expected parameter(s): embedding}. What's the problem, and how can I solve it?
Hi @jinshuaiwang,
Check which langchain_neo4j version you are using.
With the latest version, the embedding parameter has been changed to query_vector, so that needs to be changed in all the queries used in constants.py (you can refer to the dev branch constants.py).
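For illustration, the change implied here looks roughly like the following. This is a hypothetical fragment, not the project's real constants.py query: any retrieval query that still references the old $embedding parameter would need to use $query_vector with the latest langchain_neo4j.

```python
# Hypothetical retrieval-query fragment illustrating the parameter rename.

# Before (older langchain_neo4j passed the query embedding as $embedding):
OLD_RETRIEVAL_QUERY = """
WITH node, vector.similarity.cosine(node.embedding, $embedding) AS score
RETURN node.text AS text, score
"""

# After (latest langchain_neo4j passes it as $query_vector):
NEW_RETRIEVAL_QUERY = """
WITH node, vector.similarity.cosine(node.embedding, $query_vector) AS score
RETURN node.text AS text, score
"""
```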
@kaustubh-darekar Thanks a lot, I will try
@jinshuaiwang Let me know if you have any updates. I'm currently trying to run the app locally with Ollama, but I'm running into some issues. If you manage to get it working or already have a working setup, would you mind sharing your .env files for both the frontend and backend?
I'm a bit confused because there are several different answers in past discussions, and it's not entirely clear how the LLM_MODEL_CONFIG_... should be structured with Ollama.
Thanks a lot in advance!
@damiannqn90 I am debugging; please wait a while. Once everything is OK, I will paste my .env content.
@kaustubh-darekar When I merged the dev branch, that problem was solved, but when extracting the uploaded file the exception below occurs:
2025-04-10 17:48:29,014 - root - ERROR - File Failed in extraction: 'str' object has no attribute 'get'
Traceback (most recent call last):
File "/code/score.py", line 244, in extract_knowledge_graph_from_file
uri_latency, result = await extract_graph_from_file_local_file(uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, retry_condition, additional_instructions)
File "/code/src/main.py", line 257, in extract_graph_from_file_local_file
return await processing_source(uri, userName, password, database, model, file_name, pages, allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, True, merged_file_path, additional_instructions=additional_instructions)
File "/code/src/main.py", line 404, in processing_source
node_count,rel_count,latency_processed_chunk = await processing_chunks(selected_chunks,graph,uri, userName, password, database,file_name,model,allowedNodes,allowedRelationship,chunks_to_combine,node_count, rel_count, additional_instructions)
File "/code/src/main.py", line 499, in processing_chunks
graph_documents = await get_graph_from_llm(model, chunkId_chunkDoc_list, allowedNodes, allowedRelationship, chunks_to_combine, additional_instructions)
File "/code/src/llm.py", line 220, in get_graph_from_llm
graph_document_list = await get_graph_document_list(
File "/code/src/llm.py", line 199, in get_graph_document_list
graph_document_list = await llm_transformer.aconvert_to_graph_documents(combined_chunk_document_list)
File "/usr/local/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 1031, in aconvert_to_graph_documents
results = await asyncio.gather(*tasks)
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
future.result()
File "/usr/local/lib/python3.10/asyncio/tasks.py", line 232, in __step
result = coro.send(None)
File "/usr/local/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 957, in aprocess_response
not rel.get("head")
AttributeError: 'str' object has no attribute 'get'
I tried to read the source code but couldn't figure out how to solve it. What's the problem? Thanks a lot in advance! @kaustubh-darekar @damiannqn90
This is a problem with LLMGraphTransformer for certain LLMs like DeepSeek, which fail to give structured output in the specified schema. We are yet to find a solution for this.
But there is an alternative option to use DeepSeek with structured output: define the LLM using ChatDeepSeek (https://python.langchain.com/docs/integrations/chat/deepseek/). This requires a DeepSeek API key; use model = "deepseek-chat".
@kaustubh-darekar Oh, I see. Thank you very much. I want to use a locally deployed DeepSeek; looking forward to your good news.
Hi @damiannqn90, we have clearly mentioned here how to set the LLM_MODEL_CONFIG variable for Ollama models. Please try it out and let us know.
Hi @kaustubh-darekar, in my case I am trying with Ollama too.
I'm trying to run the LLM Graph Builder application locally using the instructions provided in the README under "For local LLMs (Ollama)". I've edited both the frontend and backend .env files as described, but I'm unable to generate entities from a PDF or ask questions through the chat.
I suspect my .env configuration might be incorrect — especially the LLM_MODEL_CONFIG_... part, which seems to vary depending on the model and setup. I've seen several discussions about this, but it's still not very clear how it should be structured when using Ollama.
I’d really appreciate it if you could share working versions of both .env files (frontend and backend) that use Ollama successfully.
Thanks in advance!
Hi @kartikpersistent, thank you for your response! I still have a few doubts.
In another discussion, I saw this configuration for the backend .env file:
LLM_MODEL_CONFIG_ollama_llama3="llama3.1, http://localhost:11434/"
but it was later revised to:
LLM_MODEL_CONFIG_ollama="llama3.1, http://localhost:11434/"
This left me a bit confused about how to properly configure the .env files for both the backend and frontend.
When I check the models installed on my local machine using ollama list, I get:
C:\Users\DataSpurs>ollama list
NAME ID SIZE MODIFIED
llama3:latest 365c0bd3c000 4.7 GB 2 days ago
So my question is: which format should I use in the .env file?
LLM_MODEL_CONFIG_ollama_llama3="llama3:latest, http://localhost:11434/"
LLM_MODEL_CONFIG_ollama="llama3:latest, http://localhost:11434/"
LLM_MODEL_CONFIG_ollama_llama3="llama3.1, http://localhost:11434/"
LLM_MODEL_CONFIG_ollama="llama3.1, http://localhost:11434/"
I’d really appreciate your clarification on this.
Thanks in advance!
Hi @damiannqn90, I have just tried the application with the Ollama llama3 model.
Here is the configuration:
LLM_MODEL_CONFIG_ollama_llama3=${LLM_MODEL_CONFIG_ollama_llama3-llama3,http://host.docker.internal:11434}
and I have not set anything in
VITE_LLM_MODELS=${VITE_LLM_MODELS-}
All models will get rendered in the dropdown; you can choose Ollama llama3 and click on Extract.
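A side note for anyone following along: the ${VAR-default} syntax above is Docker Compose default substitution. In a plain backend .env (no Compose substitution) it effectively reduces to a single line, for example (illustrative values based on the configuration above; the model name should match what ollama list shows):
LLM_MODEL_CONFIG_ollama_llama3="llama3,http://host.docker.internal:11434"
or, when the backend runs directly on the host rather than inside Docker:
LLM_MODEL_CONFIG_ollama_llama3="llama3,http://localhost:11434"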