[Feature Request]: Support for asyncronous Indexing with OpenAI batch APIs

Open shyshin opened this issue 1 year ago • 0 comments

Do you need to file an issue?

[X] I have searched the existing issues and this feature is not already filed.
[X] My model is hosted on OpenAI or Azure. If not, please look at the "model providers" issue and don't file a new one here.
[X] I believe this is a legitimate feature request, not just a question. If this is a question, please use the Discussions area.

Is your feature request related to a problem? Please describe.

Multiple articles including Graph RAG Costs Explained indicate that GraphRAG could be expensive especially when building a knowledge graph.

Every GraphRAG system involves 3 steps:

Indexing: Building the graph
Retrieval: Retrieving information from the graph
Generation: Generating response from the LLM

Though Retrieval and Generation are required to be sychronous as they are end user facing, our indexing operation could be asynchronous.

Describe the solution you'd like

OpenAI provides a batch API which takes in the requests in bulk. User can then retrieve the result of this batch after 24 hours. As these APIs are 50% less cheaper than the current APIs, these can be leveraged for the asynchronous graph indexing operation which can reduce the cost significantly. The result can be stored in the cache directory so that the current flow is not significantly altered.

Additional context

No response

Aug 29 '24 11:08 shyshin