
Using Ollama with deepseek-r1:1.5b can't generate a graph

Open qpb8023 opened this issue 9 months ago • 15 comments

The error occurs in the aprocess_response function of langchain_experimental/graph_transformers/llm.py. How can it be fixed?

-----
{'head': '作者', 'head_type': '相关人物', 'relation': {'涉及人物': {'描述': '与相关的人物角色有关'}}, 'tail': '研究成果', 'tail_type': '成果'}
-----
2025-02-24 17:26:41,801 - Deleted File Path: /Users/tiutiutiu/Desktop/llm-graph-builder/backend/merged_files/《教育心理学》.pdf and Deleted File Name : 《教育心理学》.pdf
[ERROR]{'api_name': 'extract', 'message': 'Failed To Process File:《教育心理学》.pdf or LLM Unable To Parse Content ', 'file_created_at': '2025-02-24 16:21:15 ', 'error_message': "1 validation error for Relationship\ntype\n  Input should be a valid string [type=string_type, input_value={'涉及人物': {'描述. input_type=dict]\n    For further information visit https://errors.pydantic.dev/2.9/v/string_type", 'file_name': '《教育心理学》.pdf', 'status': 'Failed', 'db_url': 'neo4j://192.168.6.6:7688', 'userName': 'neo4j', 'database': 'neo4j', 'failed_count': 1, 'source_type': 'local file', 'source_url': None, 'wiy': None, 'logging_time': '2025-02-24 09:26:41 UTC', 'email': None}
2025-02-24 17:26:41,809 - File Failed in extraction: 1 validation error for Relationship
type
  Input should be a valid string [type=string_type, input_value={'涉及人物': {'描述...的人物角色有关'}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
Traceback (most recent call last):
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/score.py", line 222, in extract_knowledge_graph_from_file
    uri_latency, result = await extract_graph_from_file_local_file(uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, retry_condition, additional_instructions)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/main.py", line 238, in extract_graph_from_file_local_file
    return await processing_source(uri, userName, password, database, model, fileName, [], allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, True, merged_file_path, retry_condition, additional_instructions=additional_instructions)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/main.py", line 383, in processing_source
    node_count,rel_count,latency_processed_chunk = await processing_chunks(selected_chunks,graph,uri, userName, password, database,file_name,model,allowedNodes,allowedRelationship,chunks_to_combine,node_count, rel_count, additional_instructions)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/main.py", line 478, in processing_chunks
    graph_documents =  await get_graph_from_llm(model, chunkId_chunkDoc_list, allowedNodes, allowedRelationship, chunks_to_combine, additional_instructions)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/llm.py", line 225, in get_graph_from_llm
    graph_document_list = await get_graph_document_list(
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/src/llm.py", line 208, in get_graph_document_list
    graph_document_list = await llm_transformer.aconvert_to_graph_documents(combined_chunk_document_list)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 1034, in aconvert_to_graph_documents
    results = await asyncio.gather(*tasks)
  File "/Users/tiutiutiu/.pyenv/versions/3.10.16/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
    future.result()
  File "/Users/tiutiutiu/.pyenv/versions/3.10.16/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 977, in aprocess_response
    Relationship(
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/langchain_core/load/serializable.py", line 125, in __init__
    super().__init__(*args, **kwargs)
  File "/Users/tiutiutiu/Desktop/llm-graph-builder/backend/venv/lib/python3.10/site-packages/pydantic/main.py", line 212, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 1 validation error for Relationship
type
  Input should be a valid string [type=string_type, input_value={'涉及人物': {'描述...的人物角色有关'}}, input_type=dict]
    For further information visit https://errors.pydantic.dev/2.9/v/string_type
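
For context, the validation fails because deepseek-r1 returned the relation as a nested dict instead of a plain string, so the Relationship model rejects it. Below is a minimal sketch of the kind of normalization that could be applied to the parsed output before it reaches Relationship; the helper is hypothetical and not part of llm-graph-builder or LangChain, it only illustrates the shape mismatch:

```python
# Hypothetical helper (not part of llm-graph-builder or LangChain):
# coerce a dict-valued "relation", as deepseek-r1 produced above,
# into the plain string that the Relationship model expects.
from typing import Any


def normalize_relation(rel: dict[str, Any]) -> dict[str, Any]:
    relation = rel.get("relation")
    if isinstance(relation, dict):
        # Use the first key as the relation type and drop the nested description.
        rel = {**rel, "relation": next(iter(relation), "RELATED_TO")}
    return rel


# The failing payload from the log above:
raw = {
    "head": "作者",
    "head_type": "相关人物",
    "relation": {"涉及人物": {"描述": "与相关的人物角色有关"}},
    "tail": "研究成果",
    "tail_type": "成果",
}
print(normalize_relation(raw)["relation"])  # -> 涉及人物
```

Even with a workaround like this, the root cause is that the model does not follow the requested schema, which is what the maintainers discuss below.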

qpb8023 commented on Feb 24 '25

Similar issue using locally hosted Ollama.

[ERROR]{'api_name': 'extract', 'message': 'Failed To Process File:A_Compact_GuideTo_RAG.pdf or LLM Unable To Parse Content ', 'file_created_at': '2025-02-28 09:18:43 ', 'error_message': "Ollama call failed with status code 400. Details: <bound method ClientResponse.text of <ClientResponse(http://localhost:11434/api/chat) [400 Bad Request]>\n<CIMultiDictProxy('Content-Type': 'application/json; charset=utf-8', 'Date': 'Fri, 28 Feb 2025 17:28:28 GMT', 'Content-Length': '29')>\n>", 'file_name': 'A_Compact_GuideTo_RAG.pdf', 'status': 'Failed', 'db_url': 'neo4j://localhost:7687', 'userName': 'neo4j', 'database': 'neo4j', 'failed_count': 1, 'source_typ
INFO: 127.0.0.1:51052 - "GET /health HTTP/1.1" 200 OK

The graph is created with Chunk nodes, and after that it is not able to feed them back to the LLM for further breaking down.

seemakurthy commented on Feb 28 '25

Currently, the DeepSeek LLM model is not producing structured output as expected when used with LLM Graph Builder. We are actively investigating this issue and working on a resolution. We appreciate your patience and will provide updates as we make progress.

kaustubh-darekar commented on Mar 06 '25

@kaustubh-darekar Has the problem been solved already? Looking forward to your reply.

jinshuaiwang commented on Mar 31 '25

Hi @jinshuaiwang, we haven't found a solution yet, but you can try https://python.langchain.com/docs/integrations/chat/deepseek/ (if you have a DeepSeek API key). As mentioned there, you can keep model = "deepseek-chat" for structured output.

Let us know if it works.
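
For anyone who wants to try that route outside the app first, here is a minimal sketch of the ChatDeepSeek approach described above. It assumes the langchain-deepseek package is installed and DEEPSEEK_API_KEY is set; in llm-graph-builder itself the model is wired up through the LLM_MODEL_CONFIG environment variables rather than code like this:

```python
# Sketch only: assumes `pip install langchain-deepseek langchain-experimental`
# and DEEPSEEK_API_KEY set in the environment.
from langchain_deepseek import ChatDeepSeek
from langchain_experimental.graph_transformers import LLMGraphTransformer
from langchain_core.documents import Document

# deepseek-chat supports the structured output that local deepseek-r1 fails to provide.
llm = ChatDeepSeek(model="deepseek-chat", temperature=0)
transformer = LLMGraphTransformer(llm=llm)

docs = [Document(page_content="Marie Curie won the Nobel Prize in Physics in 1903.")]
graph_documents = transformer.convert_to_graph_documents(docs)
print(graph_documents[0].nodes, graph_documents[0].relationships)
```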

kaustubh-darekar commented on Apr 03 '25

@kaustubh-darekar I run Ollama locally and use ChatOllama and OllamaEmbeddings. When I run the line 'docs = doc_retriever.invoke({"messages": messages}, {"callbacks": [handler]})', it raises: Error retrieving documents: {code: Neo.ClientError.Statement.ParameterMissing} {message: Expected parameter(s): embedding}. What's the problem, and how can I solve it?

jinshuaiwang commented on Apr 08 '25

Hi @jinshuaiwang

Check the langchain_neo4j version you are using.

With the latest version, the embeddings parameter has been changed to query_vector, so we need to change that in all the queries used in constants.py (you can refer to the dev branch constants.py).
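
To make the change concrete, here is a hypothetical before/after of the kind of Cypher fragment kept as a Python constant in constants.py. The query text and constant names are invented for illustration; only the $embedding to $query_vector rename is the point, and the dev branch constants.py is the authoritative reference:

```python
# Illustrative fragments only, not the actual constants.py contents.

# Before (older langchain_neo4j passed the vector as $embedding):
VECTOR_SEARCH_QUERY_OLD = """
CALL db.index.vector.queryNodes('vector', $top_k, $embedding)
YIELD node AS chunk, score
RETURN chunk, score
"""

# After (newer langchain_neo4j passes it as $query_vector,
# which is why queries still using $embedding raise ParameterMissing):
VECTOR_SEARCH_QUERY_NEW = """
CALL db.index.vector.queryNodes('vector', $top_k, $query_vector)
YIELD node AS chunk, score
RETURN chunk, score
"""
```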

kaustubh-darekar commented on Apr 08 '25

@kaustubh-darekar Thanks a lot, I will try

jinshuaiwang commented on Apr 09 '25

@jinshuaiwang Let me know if you have any updates. I'm currently trying to run the app locally with Ollama, but I'm running into some issues. If you manage to get it working or already have a working setup, would you mind sharing your .env files for both the frontend and backend?

I'm a bit confused because there are several different answers in past discussions, and it's not entirely clear how the LLM_MODEL_CONFIG_... should be structured with Ollama.

Thanks a lot in advance!

damiannqn90 commented on Apr 09 '25

@damiannqn90 I am still debugging; please wait a while. Once everything is OK, I will paste my .env content.

@kaustubh-darekar When I merged the dev branch, that problem was solved, but extracting the uploaded file now raises the exception below:

2025-04-10 17:48:29,014 - root - ERROR - File Failed in extraction: 'str' object has no attribute 'get'
Traceback (most recent call last):
  File "/code/score.py", line 244, in extract_knowledge_graph_from_file
    uri_latency, result = await extract_graph_from_file_local_file(uri, userName, password, database, model, merged_file_path, file_name, allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, retry_condition, additional_instructions)
  File "/code/src/main.py", line 257, in extract_graph_from_file_local_file
    return await processing_source(uri, userName, password, database, model, file_name, pages, allowedNodes, allowedRelationship, token_chunk_size, chunk_overlap, chunks_to_combine, True, merged_file_path, additional_instructions=additional_instructions)
  File "/code/src/main.py", line 404, in processing_source
    node_count,rel_count,latency_processed_chunk = await processing_chunks(selected_chunks,graph,uri, userName, password, database,file_name,model,allowedNodes,allowedRelationship,chunks_to_combine,node_count, rel_count, additional_instructions)
  File "/code/src/main.py", line 499, in processing_chunks
    graph_documents = await get_graph_from_llm(model, chunkId_chunkDoc_list, allowedNodes, allowedRelationship, chunks_to_combine, additional_instructions)
  File "/code/src/llm.py", line 220, in get_graph_from_llm
    graph_document_list = await get_graph_document_list(
  File "/code/src/llm.py", line 199, in get_graph_document_list
    graph_document_list = await llm_transformer.aconvert_to_graph_documents(combined_chunk_document_list)
  File "/usr/local/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 1031, in aconvert_to_graph_documents
    results = await asyncio.gather(*tasks)
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 304, in __wakeup
    future.result()
  File "/usr/local/lib/python3.10/asyncio/tasks.py", line 232, in __step
    result = coro.send(None)
  File "/usr/local/lib/python3.10/site-packages/langchain_experimental/graph_transformers/llm.py", line 957, in aprocess_response
    not rel.get("head")
AttributeError: 'str' object has no attribute 'get'

I tried to read the source code but couldn't solve it. What's the problem? Thanks a lot in advance! @kaustubh-darekar @damiannqn90
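
As with the earlier Pydantic error, this traceback shows the model not following the requested schema: aprocess_response receives a bare string where it expects a relation dict with "head"/"tail" keys, so rel.get(...) fails. A minimal sketch of the kind of defensive filtering that could be applied to the parsed relationships before they are consumed; the helper is hypothetical and does not exist in the project or in LangChain:

```python
# Hypothetical pre-filter (not part of llm-graph-builder or LangChain):
# drop entries the LLM emitted as bare strings instead of relation dicts.
from typing import Any


def keep_well_formed_relationships(rels: list[Any]) -> list[dict[str, Any]]:
    return [
        rel
        for rel in rels
        if isinstance(rel, dict) and rel.get("head") and rel.get("tail")
    ]


print(keep_well_formed_relationships(
    [{"head": "A", "relation": "MENTIONS", "tail": "B"}, "涉及人物"]
))
# -> [{'head': 'A', 'relation': 'MENTIONS', 'tail': 'B'}]
```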

jinshuaiwang commented on Apr 10 '25

This is a problem with LLMGraphTransformer for certain LLMs, like DeepSeek, which fail to give structured output in the specified schema. We are yet to find a solution for this.

But there is an alternate option to use DeepSeek with structured output: define the LLM using ChatDeepSeek (https://python.langchain.com/docs/integrations/chat/deepseek/). This requires a DeepSeek API key; use model = "deepseek-chat".

kaustubh-darekar commented on Apr 10 '25

@kaustubh-darekar Oh, I see. Thank you very much. I want to use a locally deployed DeepSeek, so I'm looking forward to your good news.

jinshuaiwang commented on Apr 10 '25

Hi @damiannqn90, we clearly mentioned here how to set the LLM_MODEL_CONFIG variable for Ollama models. Please try it out and let us know.

kartikpersistent commented on Apr 11 '25

Hi @kaustubh-darekar, in my case I am trying with Ollama too.

I'm trying to run the LLM Graph Builder application locally using the instructions provided in the README under "For local LLMs (Ollama)". I've edited both the frontend and backend .env files as described, but I'm unable to generate entities from a PDF or ask questions through the chat.

I suspect my .env configuration might be incorrect — especially the LLM_MODEL_CONFIG_... part, which seems to vary depending on the model and setup. I've seen several discussions about this, but it's still not very clear how it should be structured when using Ollama.

I’d really appreciate it if you could share working versions of both .env files (frontend and backend) that use Ollama successfully.

Thanks in advance!

damiannqn90 commented on Apr 11 '25

Hi @kartikpersistent, thank you for your response! I still have a few doubts.

In another discussion, I saw this configuration for the backend .env file: LLM_MODEL_CONFIG_ollama_llama3="llama3.1, http://localhost:11434/" but it was later revised to: LLM_MODEL_CONFIG_ollama="llama3.1, http://localhost:11434/"

This left me a bit confused about how to properly configure the .env files for both the backend and frontend.

When I check the models installed on my local machine using ollama list, I get:

C:\Users\DataSpurs>ollama list
NAME             ID              SIZE      MODIFIED
llama3:latest    365c0bd3c000    4.7 GB    2 days ago

So my question is: which format should I use in the .env file?

LLM_MODEL_CONFIG_ollama_llama3="llama3:latest, http://localhost:11434/"

LLM_MODEL_CONFIG_ollama="llama3:latest, http://localhost:11434/"

LLM_MODEL_CONFIG_ollama_llama3="llama3.1, http://localhost:11434/"

LLM_MODEL_CONFIG_ollama="llama3.1, http://localhost:11434/"

I’d really appreciate your clarification on this.

Thanks in advance!

damiannqn90 commented on Apr 11 '25

Hi @damiannqn90, I have just tried the application with the Ollama llama3 model. Here is the configuration:

LLM_MODEL_CONFIG_ollama_llama3=${LLM_MODEL_CONFIG_ollama_llama3-llama3,http://host.docker.internal:11434}

I have not set anything in VITE_LLM_MODELS=${VITE_LLM_MODELS-}, so all models get rendered in the dropdown; you can choose ollama llama3 and click on extract.
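
For anyone following along, a plain backend .env sketch derived from the answer above might look like the following. The localhost URL is an assumption for a non-Docker setup, and whether to write the model as llama3 or llama3:latest should match what ollama list reports; the host.docker.internal form is the one confirmed above for Docker:

```
# Backend .env (example; format is "<ollama model name>,<ollama base URL>")
LLM_MODEL_CONFIG_ollama_llama3="llama3,http://localhost:11434"
# If the backend runs inside Docker, point it at the host instead:
# LLM_MODEL_CONFIG_ollama_llama3="llama3,http://host.docker.internal:11434"

# Frontend .env: leaving VITE_LLM_MODELS empty renders all models in the dropdown
VITE_LLM_MODELS=
```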

kartikpersistent commented on Apr 11 '25