
[Question]: Define metadata for each chunk

Open erikguo opened this issue 1 year ago • 3 comments

Describe your problem

Thank you for your excellent work!

We want to use Ragflow as our main knowledge base. Our documents follow several structural patterns, and each part of a structure has coherent content, so we decided to split documents into chunks according to that structure. How can we attach metadata to each chunk? We only found the keywords attribute on chunks, which isn't suitable for storing metadata.

erikguo avatar Jul 21 '24 14:07 erikguo

Metadata on chunks is not supported yet. You can contact us at [email protected]

KevinHuSh avatar Jul 22 '24 01:07 KevinHuSh

Thank you for your quick reply.

Another question: where can we get the sequence and page number of each chunk in the source document? We couldn't find them in the code or in the chunk information.

erikguo avatar Jul 22 '24 03:07 erikguo

It's stored in Elasticsearch (ES), in the position fields.

KevinHuSh avatar Aug 01 '24 01:08 KevinHuSh

This feature is quite important in many situations. We use the metadata fields of our vector database to store information we don’t want embedded. At the moment, Ragflow seems to expose only the content and important_keywords fields, and both are used during the search process.

If you have any suggestions for this scenario, please let me know.

Thanks!

Tuanshu avatar Jul 04 '25 10:07 Tuanshu
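
For readers with the same need, the separation described in the last comment (metadata stored next to a chunk but kept out of the embedded text) can be sketched in plain Python. This is only an illustration under stated assumptions: `Chunk`, `text_to_embed`, `filter_by_metadata`, and the `section`/`page`/`seq` keys are hypothetical names made up for this example and are not part of Ragflow's API or its Elasticsearch schema.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """A hypothetical chunk record; NOT Ragflow's actual schema."""
    content: str
    important_keywords: list[str] = field(default_factory=list)
    # Side-channel metadata: stored with the chunk but never embedded.
    metadata: dict[str, Any] = field(default_factory=dict)


def text_to_embed(chunk: Chunk) -> str:
    """Build the string sent to an embedding model, leaving metadata out."""
    return " ".join([chunk.content, *chunk.important_keywords])


def filter_by_metadata(chunks: list[Chunk], **required: Any) -> list[Chunk]:
    """Keep chunks whose metadata matches every required key/value pair."""
    return [
        c for c in chunks
        if all(c.metadata.get(k) == v for k, v in required.items())
    ]


# Example data; the metadata keys are assumptions for illustration only.
chunks = [
    Chunk("Install the service with the provided script.", ["install"],
          {"section": "setup", "page": 3, "seq": 0}),
    Chunk("Rotate the API key every 90 days.", ["security"],
          {"section": "operations", "page": 7, "seq": 1}),
]

setup_chunks = filter_by_metadata(chunks, section="setup")
```

Here only `content` and `important_keywords` contribute to the embedded text, while the metadata stays available for exact-match filtering before or after vector search. Whether something equivalent can be wired into Ragflow itself depends on the version in use.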