graphrag
feat: adapt graphrag_import_neo4j for CSV input format
Description
Add support for CSV input to the graphrag_import_neo4j notebook, so that it can process input data supplied in CSV format.
Proposed Changes
- Modified column mapping in document import:
  - Changed from using 'title' to 'text' column in create_final_documents.parquet
- Updated entity import configuration:
  - Changed from using 'name' to 'title' column in create_final_entities.parquet
- Updated documentation to reflect CSV format support and configured the yaml file to use CSV format for input data
- Kept core import logic unchanged
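The column-mapping change listed above can be illustrated with a minimal sketch. It is not the notebook's actual code: `DOCUMENT_COLUMNS`, `ENTITY_NAME_COLUMN`, and `project_rows` are hypothetical names, and the presence of an `id` column is an assumption. Only the 'title' to 'text' and 'name' to 'title' switches come from this PR.

```python
# Hypothetical illustration of the column-mapping change described in this PR.
# The constant and function names are assumptions, not names from the notebook.

# Document import: read the 'text' column (previously 'title').
# The 'id' column is assumed here for illustration.
DOCUMENT_COLUMNS = ["id", "text"]

# Entity import: the entity name is read from 'title' (previously 'name').
ENTITY_NAME_COLUMN = "title"


def project_rows(rows, columns):
    """Keep only the requested columns, failing loudly if one is missing."""
    projected = []
    for row in rows:
        missing = [c for c in columns if c not in row]
        if missing:
            raise KeyError(f"missing columns: {missing}")
        projected.append({c: row[c] for c in columns})
    return projected


if __name__ == "__main__":
    docs = [{"id": "d1", "text": "hello", "extra": 1}]
    print(project_rows(docs, DOCUMENT_COLUMNS))
```

Failing on a missing column, rather than silently skipping it, makes a mismatch between the expected and actual parquet schema visible before any data reaches Neo4j.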
Checklist
- [x] I have tested these changes locally.
- [x] I have reviewed the code changes.
Additional Notes
This change maintains backward compatibility while adding support for CSV format input. The core functionality remains unchanged; only the input column mappings have been modified to accommodate the CSV format.
@liyun11118 please read the following Contributor License Agreement (CLA). If you agree with the CLA, please reply with the following information.
@microsoft-github-policy-service agree [company="{your company}"]
Options:
- (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
  @microsoft-github-policy-service agree
- (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
  @microsoft-github-policy-service agree company="Microsoft"
@microsoft-github-policy-service agree
@AlonsoGuevara Please take a look when possible. Thanks!