
[Issue]: Prompt tuning slows down the indexing time

Open kouskouss opened this issue 1 year ago • 5 comments

Is there an existing issue for this?

  • [X] I have searched the existing issues
  • [X] I have checked #657 to validate if my issue is covered by community support

Describe the issue

I tried using prompt tuning to create a prompt better suited to my case. Afterward, I ran GraphRAG indexing on 23 SEC filings, and the indexing time increased significantly: it now takes around 2.5 to 3 hours for entity extraction alone, and community reports will likely take a similar amount of time. This is a considerable increase compared to the default prompt, or to minor manual adjustments to the entity extraction prompt, which took only about 8-12 minutes for the same steps. I also noticed that the community report prompt changed slightly with prompt tuning. I am using gpt-4o-mini in all cases, with chunk size = 600 and overlap = 100.
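For reference, the settings described above correspond roughly to this fragment of GraphRAG's `settings.yaml` (key names follow the 0.1.x default template; treat this as a sketch, not the reporter's actual config, which was not posted):

```yaml
llm:
  model: gpt-4o-mini   # used for extraction and community reports
chunks:
  size: 600            # tokens per chunk
  overlap: 100         # tokens shared between adjacent chunks
```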

Has anyone encountered a similar issue? Is there an explanation for this significant difference in processing time?

Steps to reproduce

No response

GraphRAG Config Used

# Paste your config here

Logs and screenshots

No response

Additional Information

  • GraphRAG Version: 0.1.1
  • Operating System: Windows 10.0.22631
  • Python Version: 3.11.5

kouskouss avatar Jul 26 '24 11:07 kouskouss

Can you describe the differences in the prompt? Also, can you report on the number of extracted entities with the default prompt versus the tuned prompt? My first guess is that the prompt tuning is finding far more entities in your dataset, resulting in a lot more processing. If you can assess the quality of the extracted entities in either case, this may indicate whether the prompt tuning is actually working better and the processing time is an unfortunate side effect, or if it is not adding value for you.

Also note that increasing the chunk size will greatly speed up most tasks. We often run with different settings and compare to find the best balance of entity quantity versus performance for the domain.
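The chunk-size advice above follows from how sliding-window chunking works: each chunk is one entity-extraction LLM call, so doubling the chunk size roughly halves the number of calls. A minimal back-of-the-envelope model (my own sketch, not GraphRAG's internal chunker):

```python
import math

def estimate_chunks(total_tokens: int, chunk_size: int, overlap: int) -> int:
    """Rough number of sliding-window chunks (~= extraction LLM calls).

    Each new chunk advances by (chunk_size - overlap) tokens.
    """
    step = chunk_size - overlap
    return max(1, math.ceil((total_tokens - overlap) / step))

# For a 3000-token document with the reporter's settings (600/100):
print(estimate_chunks(3000, 600, 100))   # 6 chunks
# Doubling chunk size to 1200 with the same overlap:
print(estimate_chunks(3000, 1200, 100))  # 3 chunks -> half the LLM calls
```

This is why larger chunks speed up indexing, at the cost of potentially missing entities buried deep inside each chunk.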

natoverse avatar Jul 26 '24 17:07 natoverse


  1. The main difference is in the examples: the tuned prompt contains only one example, and it is much more specific/complicated (numbers, percentages, locations) than the default prompt, which uses three examples of one or two sentences each with simpler content.
  2. Number of entities with prompt tuning: 13,018, vs. number of entities with manual prompt engineering (the default prompt slightly changed, with a somewhat more specific example): 41,574.
  3. There is also a big difference in the number of relationships: prompt tuning produced far fewer (207) than the graph built with prompt engineering (12,913).
  4. As a result, the GraphRAG built with the tuned prompt performs worse than the one built with manual prompt engineering.
  5. I am using the same chunk size (600) in all cases, as it was fast with prompt engineering.
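Putting the two runs side by side, relationships per entity is a crude proxy for how connected each graph is (my own comparison of the numbers reported above):

```python
def graph_density(n_entities: int, n_relationships: int) -> float:
    """Relationships per entity: a rough measure of graph connectivity."""
    return n_relationships / n_entities

tuned = graph_density(13018, 207)      # auto-tuned prompt  -> ~0.016
manual = graph_density(41574, 12913)   # manual engineering -> ~0.31
print(f"tuned: {tuned:.3f}, manual: {manual:.3f}")
```

The tuned prompt's graph is nearly 20x sparser, which is consistent with the worse downstream performance reported in point 4.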

kouskouss avatar Jul 30 '24 09:07 kouskouss

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions[bot] avatar Aug 07 '24 01:08 github-actions[bot]

Ok, thanks for the additional detail. There aren't any obvious reasons why the prompt would slow things down other than entity/relationship counts - generally, the more there are, the slower it runs. Your results seem to indicate the opposite: the run that produced fewer results from the auto-tuned prompt is also the slower one. 8-12 minutes sounds really fast for 41k entities - if you have already run your original prompts, you might be getting the benefit of the cache. A good test would be to take a subset of your data (so you don't spend the time/money on a full re-run) and run each prompt in a fresh environment with no cache.

One other comment: both of these seem like they have very few relationships relative to the entities. We normally see more, but that may be due to the nature of your data.

natoverse avatar Aug 07 '24 23:08 natoverse

This issue has been marked stale due to inactivity after repo maintainer or community member responses that request more information or suggest a solution. It will be closed after five additional days.

github-actions[bot] avatar Aug 16 '24 19:08 github-actions[bot]