
[Question]: Use Chunking methods outside of Ragflow?

Open SaidKhudoyan opened this issue 1 year ago • 0 comments

Describe your problem

Is it possible to apply the different chunking methods to already-parsed files in one of my local directories? Just as document parsing can be used on its own, I would like to take the parsed documents, chunk them (e.g. with GraphRag or Law), and store the results in an output directory, perhaps via a command-line call (or API call) like this:

python ragflow_parser.py --parsing_method Input_dir Output_dir
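
To make the requested interface concrete, here is a minimal sketch of how such a command line could be parsed with Python's standard `argparse` module. It is only an illustration of the request: `ragflow_parser.py` does not exist in RAGFlow as far as this issue knows, and the sketch assumes that `--parsing_method` takes a method name (such as `Law`) as its value, followed by the input and output directories.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical CLI mirroring the command proposed in this issue;
    # this is not an existing RAGFlow entry point.
    parser = argparse.ArgumentParser(
        prog="ragflow_parser.py",
        description="Chunk already-parsed documents from a local directory.",
    )
    # Assumption: the flag carries the chunking method name, e.g. "Law".
    parser.add_argument("--parsing_method", required=True,
                        help="name of the chunking method to apply")
    parser.add_argument("input_dir", help="directory with parsed documents")
    parser.add_argument("output_dir", help="directory to write chunks to")
    return parser


# Example invocation with an explicit argument list instead of sys.argv.
args = build_parser().parse_args(
    ["--parsing_method", "Law", "parsed_docs", "chunks_out"]
)
print(args.parsing_method, args.input_dir, args.output_dir)
```

The chunking itself is deliberately left out here, since that is the part this question asks RAGFlow to expose.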

Each chunk should be written to its own txt or JSON file, named by its doc_id and chunk_idx. Alternatively, there could be one large JSON file per document, for example:

{"parsing_method": "Law", "doc_id": "some_doc_id", "doc_name": "ExampleName", "doc_summary": "summary", "chunks": [{"chunk_id": "id1", "text": "chunk_text"}, ...]}
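
The two output layouts described above could be sketched roughly as follows. This is an illustration of the desired file structure only: `naive_chunks` is a placeholder that splits text into fixed-size character windows and stands in for a real chunking method such as Law or GraphRag, and the function names, the `doc_id_chunk_idx` file naming, and the treatment of `chunks` as a list are all assumptions made for this example.

```python
import json
from pathlib import Path


def naive_chunks(text: str, size: int = 200) -> list[str]:
    # Placeholder chunker (fixed-size character windows).
    # It is NOT one of RAGFlow's chunking methods.
    return [text[i:i + size] for i in range(0, len(text), size)]


def write_per_chunk(doc_id: str, chunks: list[str], out_dir: Path) -> list[Path]:
    # Layout 1: one txt file per chunk, named <doc_id>_<chunk_idx>.txt.
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for chunk_idx, chunk in enumerate(chunks):
        path = out_dir / f"{doc_id}_{chunk_idx}.txt"
        path.write_text(chunk, encoding="utf-8")
        paths.append(path)
    return paths


def write_per_document(doc_id: str, doc_name: str, parsing_method: str,
                       chunks: list[str], out_dir: Path,
                       doc_summary: str = "") -> Path:
    # Layout 2: one JSON file per document holding all of its chunks.
    out_dir.mkdir(parents=True, exist_ok=True)
    record = {
        "parsing_method": parsing_method,
        "doc_id": doc_id,
        "doc_name": doc_name,
        "doc_summary": doc_summary,
        "chunks": [
            {"chunk_id": f"{doc_id}_{chunk_idx}", "text": chunk}
            for chunk_idx, chunk in enumerate(chunks)
        ],
    }
    path = out_dir / f"{doc_id}.json"
    path.write_text(json.dumps(record, ensure_ascii=False, indent=2),
                    encoding="utf-8")
    return path
```

With a real chunking method in place of `naive_chunks`, either function could be called once per parsed document found in the input directory.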

SaidKhudoyan, Oct 25 '24 18:10