[Feature Request]: Mapping output to metadata and filtering by metadata

Open EthanHuang0404 opened this issue 1 year ago • 1 comments

Do you need to file an issue?

[X] I have searched the existing issues and this feature is not already filed.
[ ] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
[X] I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

Is your feature request related to a problem? Please describe.

I am working on a retrieval task using a regulation dataset stored in a text file, where each regulation is structured as {rule_name}, {section}, {content}, and {type} (either "external" or "internal"). The goal is to create a graph database that captures relationships between the metadata of each regulation, particularly the relationships between specific external regulations and their corresponding internal regulations, and vice versa.

To achieve this, I have split the original text file into smaller files, one per regulation. Each chunk retains the complete metadata for its regulation, and I build the graph on these chunks. The intended functionality is to input an external regulation (e.g., {rule_name}, {section}, {"type": "external regulation"}) and have the system return a list of related internal regulations, along with their metadata ({rule_name}, {section}, {"type": "internal regulation"}), and vice versa.

However, I am encountering issues where:

The system sometimes fails to map the output correctly to the original metadata.
The GraphRAG system does not currently differentiate between external and internal regulations. I would like to restrict the retrieved range to the specified type (e.g., when querying with an external regulation, only internal regulations should be retrieved, and vice versa).

I would appreciate any guidance on task structuring or adjustments that could improve the accuracy of metadata retrieval and ensure that the system distinguishes between different regulation types during the retrieval process.

Describe the solution you'd like

I would like the system to accurately return the metadata for regulations based on the input, ensuring that:

The system correctly maps the output to the original metadata (e.g., {rule_name}, {section}, {content}, {type}).
The retrieval process differentiates between external and internal regulations. Specifically, if I input an external regulation, the system should only return related internal regulations, and vice versa. This distinction should be clearly maintained within the GraphRAG system, allowing me to restrict the retrieval range to the specified type.
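
To make the two requirements concrete, here is a minimal sketch of the behavior as a post-retrieval step. It does not use any GraphRAG API. The `Regulation` class, `build_index`, `opposite_type`, and `filter_related` are hypothetical names for illustration, and it assumes that each retrieved candidate can be identified by a `(rule_name, section)` key and that `type` is exactly "external" or "internal".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Regulation:
    # Hypothetical record mirroring the metadata fields described in this issue.
    rule_name: str
    section: str
    content: str
    type: str  # assumed to be exactly "external" or "internal"


def build_index(regulations):
    """Map (rule_name, section) to the original record so output can be traced back."""
    return {(r.rule_name, r.section): r for r in regulations}


def opposite_type(reg_type):
    """Return the type that should be retrieved for a query of the given type."""
    return "internal" if reg_type == "external" else "external"


def filter_related(query, candidate_keys, index):
    """Keep only candidates that exist in the index and have the opposite type.

    candidate_keys is whatever the retrieval step produced, reduced to
    (rule_name, section) pairs; how those pairs are obtained is out of scope here.
    """
    wanted = opposite_type(query.type)
    results = []
    for key in candidate_keys:
        reg = index.get(key)  # unknown keys are dropped rather than guessed
        if reg is not None and reg.type == wanted:
            results.append(reg)
    return results
```

With an index built from the original regulations, `filter_related(query, keys, index)` returns full `Regulation` records, so every result carries its original {rule_name}, {section}, {content}, and {type}. Candidates that cannot be matched to a known key are discarded instead of being returned with incomplete metadata.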

Additional context

No response

Sep 23 '24 09:09 EthanHuang0404

I also hope this feature gets added. For example, I sometimes have a CSV file where the first two columns record "time" and "location", while the third column, "event", records what happened. I would like multi-column input so that all the nodes extracted from the third column can be connected to the values in the first two columns. Thanks!
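
A rough sketch of the multi-column idea, assuming a CSV with the header `time,location,event`. The function names, the edge labels, and the `naive_extract` placeholder (which simply picks capitalized words) are hypothetical and stand in for whatever entity extraction the system actually performs.

```python
import csv
import io


def naive_extract(text):
    """Placeholder extractor: treat capitalized words as entities."""
    return [w.strip(".,") for w in text.split() if w[:1].isupper()]


def rows_to_edges(csv_text, extract_entities=naive_extract):
    """Link every entity found in the 'event' column to that row's time and location."""
    edges = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row in reader:
        for entity in extract_entities(row["event"]):
            # Edge labels are illustrative, not part of any existing schema.
            edges.append((entity, "occurred_at_time", row["time"]))
            edges.append((entity, "occurred_at_location", row["location"]))
    return edges


sample = "time,location,event\n2024-01-01,Paris,Alice met Bob\n"
edges = rows_to_edges(sample)
```

Each extracted entity ends up with one edge to the row's "time" value and one to its "location" value, which is the connection described above.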

Oct 12 '24 19:10 YepJin