Optimized _build_text_unit_context function for improved time and space complexity

Open arjun-234 opened this issue 1 year ago • 2 comments

Refactored the _build_text_unit_context function to enhance its performance and efficiency. Key optimizations include:

  1. Set for Text Unit IDs: Replaced list-based membership checks with a set (text_unit_ids_set), so each membership check takes constant time on average instead of a linear scan, reducing overall time complexity.
  2. Direct Attribute Removal: Utilized pop with a default value (None) to directly remove attributes entity_order and num_relationships from text units, minimizing overhead and avoiding potential KeyError.
  3. Default Dictionary for Entity Orders: Implemented defaultdict for managing entity orders, simplifying the ranking process and improving readability.

These changes make the function more efficient, especially when handling large datasets or many selected entities. The core functionality is unchanged; the refactoring only improves time and space usage.
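The three optimizations listed above can be sketched as follows. This is a minimal illustration, not the actual graphrag code: the function name `select_text_units`, its parameters, and the input shapes (entities as dicts with a `text_unit_ids` list, a `relationship_counts` mapping) are assumptions made for the example. Only `text_unit_ids_set`, `entity_order`, and `num_relationships` come from the description.

```python
from collections import defaultdict


def select_text_units(selected_entities, text_units_by_id, relationship_counts):
    """Hypothetical stand-in for the ranking step; not graphrag's real API."""
    # 1. A set gives average O(1) membership checks; a list would scan linearly.
    text_unit_ids_set = set()
    # 3. defaultdict(list) groups text units by entity order with no key checks.
    units_by_entity_order = defaultdict(list)

    for entity_order, entity in enumerate(selected_entities):
        for unit_id in entity["text_unit_ids"]:
            if unit_id in text_unit_ids_set or unit_id not in text_units_by_id:
                continue  # skip duplicates and unknown IDs
            text_unit_ids_set.add(unit_id)
            unit = dict(text_units_by_id[unit_id])  # copy; leave the input untouched
            unit["entity_order"] = entity_order
            unit["num_relationships"] = relationship_counts.get(unit_id, 0)
            units_by_entity_order[entity_order].append(unit)

    # Rank by entity order first, then by relationship count (highest first).
    ranked = []
    for entity_order in sorted(units_by_entity_order):
        group = units_by_entity_order[entity_order]
        ranked.extend(sorted(group, key=lambda u: -u["num_relationships"]))

    # 2. pop with a default removes the temporary attributes and never raises
    #    KeyError, even if an attribute is already missing.
    for unit in ranked:
        unit.pop("entity_order", None)
        unit.pop("num_relationships", None)
    return ranked
```

Copying each text unit before annotating it is a choice made for this sketch so the caller's data is not mutated; the PR itself may handle this differently.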

arjun-234, Jul 07 '24 06:07

@microsoft-github-policy-service agree

arjun-234, Jul 07 '24 06:07

Hi @arjun-234, thanks for this PR! I will clone it locally to review the changes and run some tests. Please be on the lookout for updates from me.

AlonsoGuevara, Jul 09 '24 18:07