ragflow
[Question]: Use Chunking methods outside of Ragflow?
Describe your problem
Is it possible to apply the different chunking methods to already parsed files in one of my local directories? Just as document parsing can be used on its own, I would like to take the parsed documents, chunk them (e.g. with GraphRag or Law), and store the chunks in an output directory. Maybe via the command line (or an API call?) like this:
python ragflow_parser.py --parsing_method Input_dir Output_dir
Each chunk should be written to its own txt or json file, with the doc_id and chunk_idx as the filename. Alternatively, there could be one big json file per document, looking e.g. like this:
{"parsing_method": "Law", "doc_id": "some_doc_id", "doc_name": "ExampleName", "doc_summary": "summary", "chunks": [{"chunk_id": "id1", "text": "chunk_text"}, ......]}
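
To make the request more concrete, here is a minimal sketch of the two output layouts described above. It does not call RAGFlow at all: `split_into_chunks` is a hypothetical placeholder that just packs paragraphs together, standing in for whichever real chunking method (Law, GraphRag, ...) would be plugged in. The function names, the use of the file stem as `doc_id`, and the `<doc_id>_<chunk_idx>` naming are all my assumptions. `doc_summary` is left out because producing it would need a model.

```python
import json
from pathlib import Path


def split_into_chunks(text: str, max_chars: int = 500) -> list[str]:
    """Placeholder chunker: pack blank-line-separated paragraphs into chunks.

    This is NOT one of RAGFlow's chunking methods; the real one would go here.
    """
    chunks: list[str] = []
    current = ""
    for para in (p.strip() for p in text.split("\n\n")):
        if not para:
            continue
        candidate = f"{current}\n\n{para}" if current else para
        if current and len(candidate) > max_chars:
            # Adding this paragraph would exceed the limit: close the chunk.
            chunks.append(current)
            current = para
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def write_chunks(doc_path: Path, output_dir: Path, parsing_method: str) -> dict:
    """Write one txt file per chunk plus one JSON file for the whole document."""
    doc_id = doc_path.stem  # assumption: the file stem doubles as doc_id
    chunks = split_into_chunks(doc_path.read_text(encoding="utf-8"))
    output_dir.mkdir(parents=True, exist_ok=True)

    # Layout 1: one file per chunk, named <doc_id>_<chunk_idx>.txt
    for idx, chunk in enumerate(chunks):
        (output_dir / f"{doc_id}_{idx}.txt").write_text(chunk, encoding="utf-8")

    # Layout 2: one JSON file per document
    record = {
        "parsing_method": parsing_method,
        "doc_id": doc_id,
        "doc_name": doc_path.name,
        "chunks": [
            {"chunk_id": f"{doc_id}_{idx}", "text": chunk}
            for idx, chunk in enumerate(chunks)
        ],
    }
    (output_dir / f"{doc_id}.json").write_text(
        json.dumps(record, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return record


def process_directory(input_dir: Path, output_dir: Path, parsing_method: str) -> int:
    """Chunk every .txt file in input_dir and return the number of documents."""
    count = 0
    for doc_path in sorted(input_dir.glob("*.txt")):
        write_chunks(doc_path, output_dir, parsing_method)
        count += 1
    return count
```

Calling `process_directory(Path("Input_dir"), Path("Output_dir"), "Law")` would then give the result I am after, only with the real chunking method in place of the placeholder.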