[Issue]: How I resolved issues related to 'create_final_community_reports: community'
Is there an existing issue for this?
- [ ] I have searched the existing issues
- [ ] I have checked #657 to validate if my issue is covered by community support
Describe the issue
The problem is in graphrag-local-ollama/graphrag/index/graph/extractors/community_reports/community_reports_extractor.py, the code that generates the community summary report. It obtains the response by awaiting the asynchronous method self._llm with the following parameters:
response = (
    await self._llm(
        self._extraction_prompt,
        json=True,
        name="create_community_report",
        variables={self._input_text_key: inputs[self._input_text_key]},
        is_response_valid=lambda x: dict_has_keys_with_types(
            x,
            [
                ("title", str),
                ("summary", str),
                ("findings", list),
                ("rating", float),
                ("rating_explanation", str),
            ],
        ),
        model_parameters={"max_tokens": self._max_report_length},
    )
)
The call returns JSON data, and the text output (title, summary, findings, and so on) is built from it with text_output = self._get_text_output(output). Both are then returned as a CommunityReportsResult, with structured_output holding the JSON data and output holding the string.
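The is_response_valid callback decides whether the model's JSON is accepted at all. The sketch below is a stand-in for dict_has_keys_with_types, written under the assumption that it checks each key with isinstance; the name has_keys_with_types is hypothetical and the real helper may behave differently. Under that assumption, a response whose rating comes back as an integer instead of a float would be rejected even though the rest of the report is usable.

```python
from typing import Any


def has_keys_with_types(data: Any, expected: list[tuple[str, type]]) -> bool:
    """Return True if data is a dict holding every key with a value of the given type.

    Hypothetical stand-in for dict_has_keys_with_types; the real helper may differ.
    """
    if not isinstance(data, dict):
        return False
    return all(key in data and isinstance(data[key], typ) for key, typ in expected)


# Same key/type pairs as in the call above.
EXPECTED = [
    ("title", str),
    ("summary", str),
    ("findings", list),
    ("rating", float),
    ("rating_explanation", str),
]

report = {
    "title": "Village of Hana Ri",
    "summary": "A close-knit community.",
    "findings": [],
    "rating": 7.5,
    "rating_explanation": "Moderate impact.",
}

valid = has_keys_with_types(report, EXPECTED)
# An int is not a float under isinstance, so this variant fails the check.
int_rating_valid = has_keys_with_types({**report, "rating": 7}, EXPECTED)
# A response missing required keys fails as well.
missing_key_valid = has_keys_with_types({"title": "x"}, EXPECTED)
```

If this is what happens with a local model, printing the raw response before validation would show which key or type is off.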
So the extractor can be simplified to the following, which skips the LLM call and loads previously saved output instead:
# Copyright (c) 2024 Microsoft Corporation.
# Licensed under the MIT License

"""A module containing 'CommunityReportsResult' and 'CommunityReportsExtractor' models."""

import json
import logging
import traceback
from dataclasses import dataclass
from typing import Any

from graphrag.index.typing import ErrorHandlerFn
from graphrag.index.utils import dict_has_keys_with_types
from graphrag.llm import CompletionLLM

from .prompts import COMMUNITY_REPORT_PROMPT

log = logging.getLogger(__name__)


@dataclass
class CommunityReportsResult:
    """Community reports result class definition."""

    output: str
    structured_output: dict


class CommunityReportsExtractor:
    """Community reports extractor class definition."""

    _llm: CompletionLLM
    _input_text_key: str
    _extraction_prompt: str
    _output_formatter_prompt: str
    _on_error: ErrorHandlerFn
    _max_report_length: int

    def __init__(
        self,
        llm_invoker: CompletionLLM,
        input_text_key: str | None = None,
        extraction_prompt: str | None = None,
        on_error: ErrorHandlerFn | None = None,
        max_report_length: int | None = None,
    ):
        """Init method definition."""
        self._llm = llm_invoker
        self._input_text_key = input_text_key or "input_text"
        self._extraction_prompt = extraction_prompt or COMMUNITY_REPORT_PROMPT
        self._on_error = on_error or (lambda _e, _s, _d: None)
        self._max_report_length = max_report_length or 1500

    async def __call__(self, inputs: dict[str, Any]):
        """Call method definition."""
        # Workaround: skip the LLM call and load a previously saved structured report.
        file_path = "/kaggle/working/typestructured.txt"
        # Open file and read JSON data
        with open(file_path, "r") as f:
            output = json.load(f)
        # The text report is hard-coded rather than built by _get_text_output.
        return CommunityReportsResult(
            structured_output=output,
            output="""# Village of Hana Ri
The Village of Hana Ri, named after its most notable resident, is a close-knit community centered around family and tradition. The key entities within this village include Hana Ri himself, his family members such as his uncle '三叔' and younger sister, various villagers, artisans, and even a blacksmith. The relationships between these entities are deeply intertwined, with significant events like the departure of Hana Ri's parents affecting him profoundly.
## Hana Ri, the primary character, lives in the village and is one of the children rarely heard by his real name.
Hana Ri, also known as '二愣子' by '老张叔', is a resident of the Village of Hana Ri. He is one of the children in the village who seldom hears his real name called [Data: Characters (1)]. This suggests that Hana Ri may have a unique or unconventional role within the community.
## Hana Ri's relationship with his family is significant, particularly with his younger sister.
Hana Ri has a close bond with his younger sister. He intends to earn money to return home and thinks about picking more red jujubes for her when he goes to the mountain [Data: Characters (1), Relationships (+more)]. This indicates that Hana Ri values his family and is willing to make sacrifices for them.
## Hana Ri admires his uncle, the blacksmith, and aspires to become an apprentice under him.
Hana Ri greatly admires his uncle, the blacksmith. He wants to become an apprentice for the artisan master [Data: Characters (1), Relationships (+more)]. This shows that Hana Ri respects and looks up to skilled individuals within the community.
## The departure of Hana Ri's parents has a profound impact on him.
Hana Ri is affected by his parents leaving him [Data: Characters (1), Events (+more)]. This event likely shapes Hana Ri's perspective and motivations within the community.
## The Village of Hana Ri has a rich tradition of jujube cultivation.
Hana Ri thinks about picking more red jujubes for his sister when he goes to the mountain [Data: Characters (1), Items (+more)]. This suggests that jujubes are an important part of the village's culture and economy.""",
        )

    def _get_text_output(self, parsed_output: dict) -> str:
        title = parsed_output.get("title", "Report")
        summary = parsed_output.get("summary", "")
        findings = parsed_output.get("findings", [])

        def finding_summary(finding: dict):
            # A finding may be a bare string instead of a dict.
            if isinstance(finding, str):
                return finding
            return finding.get("summary")

        def finding_explanation(finding: dict):
            if isinstance(finding, str):
                return ""
            return finding.get("explanation")

        report_sections = "\n\n".join(
            f"## {finding_summary(f)}\n\n{finding_explanation(f)}" for f in findings
        )
        return f"# {title}\n\n{summary}\n\n{report_sections}"
Below is the link to my Kaggle code, which is based on the graphrag-local-ollama code with some modifications of my own: https://www.kaggle.com/code/xipeig/graphrag-ollama
For reference, my modified file keeps the original LLM call in __call__ commented out, together with the debugging writes that dump the output under /kaggle/working, and replaces it with the file read. The rest of the file is the same as the listing above.
    async def __call__(self, inputs: dict[str, Any]):
        """Call method definition."""
        # output = None
        # try:
        #     response = (
        #         await self._llm(
        #             self._extraction_prompt,
        #             json=True,
        #             name="create_community_report",
        #             variables={self._input_text_key: inputs[self._input_text_key]},
        #             is_response_valid=lambda x: dict_has_keys_with_types(
        #                 x,
        #                 [
        #                     ("title", str),
        #                     ("summary", str),
        #                     ("findings", list),
        #                     ("rating", float),
        #                     ("rating_explanation", str),
        #                 ],
        #             ),
        #             model_parameters={"max_tokens": self._max_report_length},
        #         )
        #         or {}
        #     )
        #     output = response.json or {}
        # except Exception as e:
        #     log.exception("error generating community report")
        #     self._on_error(e, traceback.format_exc(), None)
        #     output = {}
        # try:
        #     with open("/kaggle/working/typestructured_output.txt", "w") as f:
        #         json.dump(output, f, indent=4)
        # except Exception as e:
        #     with open("/kaggle/working/output.txt", "w") as f:
        #         f.write(str(e))
        # text_output = self._get_text_output(output)
        # try:
        #     with open("/kaggle/working/typeoutput.txt", "w") as f:
        #         f.write(text_output)
        # except Exception as e:
        #     with open("/kaggle/working/output1.txt", "w") as f:
        #         f.write(str(e))
        file_path = "/kaggle/working/typestructured.txt"
        # Open the file and read the JSON data
        with open(file_path, "r") as f:
            output = json.load(f)
Steps to reproduce
No response
GraphRAG Config Used
# Paste your config here
Logs and screenshots
No response
Additional Information
- GraphRAG Version:
- Operating System:
- Python Version:
- Related Issues:
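Could you just post this in Chinese and say exactly what needs to be changed, so I can copy it? Reading the source code is too tiring.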
Can you open a PR for this?
You can simply restart ollama serve and it will work.
We have resolved several issues related to text encoding and JSON parsing that are rolled up into version 0.2.2. We believe this issue is resolved as part of this release.
It works. Thank you.