[Feature Request]: Split at a prepared delimiter instead of token splitting.

Open RobertHH-IS opened this issue 1 year ago • 3 comments

Is your feature request related to a problem? Please describe.

A big part of good RAG is the quality of the input data. I would like to prepare chunks myself, each with its text and metadata, before graph extraction. A simple "delim" splitter would be a great addition, as opposed to the character or token chunker, which cuts at far more arbitrary positions.

Describe the solution you'd like

Allow us to specify a delim in the chunks section of settings.yaml. If it is specified, no token chunking is done; the input is simply split at the delim and processing proceeds from there.
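A minimal sketch of the requested behaviour, to make the idea concrete. This is not GraphRAG code: the helper name `split_on_delimiter`, the `<<<CHUNK>>>` marker, and the idea of passing the value from a hypothetical `delim` setting are all assumptions for illustration.

```python
from typing import Optional


def split_on_delimiter(text: str, delimiter: Optional[str]) -> list[str]:
    """Hypothetical helper: split text at a user-prepared delimiter.

    Not part of GraphRAG; it only illustrates the behaviour requested above.
    """
    if not delimiter:
        # No delimiter configured: return the text as a single piece and
        # leave chunking to whatever token-based chunker already exists.
        return [text]
    # Split exactly at the delimiter, trim whitespace, and drop empty pieces
    # (for example the one produced by a trailing delimiter).
    return [piece.strip() for piece in text.split(delimiter) if piece.strip()]


# Example input prepared by the user: each chunk carries its own metadata line.
doc = (
    "title: A\nFirst prepared chunk.\n"
    "<<<CHUNK>>>\n"
    "title: B\nSecond prepared chunk.\n"
    "<<<CHUNK>>>\n"
)
chunks = split_on_delimiter(doc, "<<<CHUNK>>>")
for chunk in chunks:
    print(repr(chunk))
```

With this approach the chunk boundaries are exactly the ones chosen during data preparation, so text and metadata that belong together are never separated by a token limit.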

Additional context

No response

RobertHH-IS, Jul 26 '24 14:07