[Question]: How does metadata work?

Open HaotianJiang056 opened this issue 11 months ago • 6 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

Describe your problem

When I asked the model about the metadata of the citations it provided, I found that it had not received any metadata. How is metadata supposed to be used?

HaotianJiang056 avatar Mar 28 '25 01:03 HaotianJiang056

The metadata of a document is shown to the LLM whenever chunks from that document are retrieved as relevant to the given question. To trace the content actually sent to the LLM, click the little lamp icon or set up Langfuse.
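As a rough illustration only (this is not RAGFlow's actual code), the behaviour described above could look like the sketch below: retrieved chunks are grouped by source document, and each group is prefixed with that document's metadata before being placed in the prompt. The `Chunk` class, the `build_context` function, and the field names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Chunk:
    # Hypothetical shape of a retrieved chunk; not RAGFlow's real data model.
    doc_name: str
    text: str
    doc_metadata: dict = field(default_factory=dict)


def build_context(chunks):
    """Group retrieved chunks by document and prefix each group with its metadata."""
    by_doc = {}
    for chunk in chunks:
        by_doc.setdefault(chunk.doc_name, []).append(chunk)

    sections = []
    for doc_name, doc_chunks in by_doc.items():
        lines = [f"Document: {doc_name}"]
        # Metadata is attached once per document, not once per chunk.
        for key, value in doc_chunks[0].doc_metadata.items():
            lines.append(f"{key}: {value}")
        lines.append("Relevant fragments:")
        lines.extend(f"- {c.text}" for c in doc_chunks)
        sections.append("\n".join(lines))
    return "\n\n".join(sections)


# Example data (made up for illustration).
meta = {"author": "Alice", "status": "unpaid"}
chunks = [
    Chunk("invoice_001.pdf", "Total due: 120 EUR", meta),
    Chunk("invoice_001.pdf", "Due date: 2025-04-30", meta),
]
context = build_context(chunks)
print(context)
```

If the metadata lines do not show up in the traced prompt in a form like this, the LLM has no way to answer questions about them.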

KevinHuSh avatar Mar 28 '25 02:03 KevinHuSh

@KevinHuSh Sorry to bother you, but I found that the LLM doesn't get the metadata. Here are my steps. First, I added metadata to the knowledge base:

Image

Then I made an agent:

Image

Image

But it replied with this:

Image

HaotianJiang056 avatar Mar 28 '25 09:03 HaotianJiang056

Now I realise the metadata won't be shown to the Agent LLM.

HaotianJiang056 avatar Mar 28 '25 09:03 HaotianJiang056

It also doesn't quite work properly in chat... The metadata is shown in the dialog with the light bulb, but the LLM does not seem to recognize it.

Snify89 avatar Mar 31 '25 05:03 Snify89

Related to this, is there a way to properly filter on metadata?

If I'm interpreting the documentation correctly, when a file has relevant content and is retrieved, its metadata is also shown to further enhance the context of the document. But is it possible to check the metadata before retrieval?

Say that we only want to include files by a certain author in our response. Is there any way to tell the system to only retrieve documents where the "author" metadata is set to the specified author? Or does it just come down to prompt engineering to try to ignore certain files?
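To make the question concrete, here is a minimal sketch of what filtering on metadata before retrieval could mean. It does not use any RAGFlow API: the document structure, `filter_by_metadata`, `keyword_score`, and `retrieve` are hypothetical names, and the toy keyword score only stands in for real vector search.

```python
def filter_by_metadata(documents, required):
    """Keep only documents whose metadata matches every required key/value pair."""
    return [
        doc for doc in documents
        if all(doc.get("metadata", {}).get(k) == v for k, v in required.items())
    ]


def keyword_score(query, text):
    # Toy relevance score (count of shared lowercase words); a stand-in for vector search.
    return len(set(query.lower().split()) & set(text.lower().split()))


def retrieve(query, documents, required_metadata, top_k=3):
    # Step 1: narrow the candidate set using metadata only.
    candidates = filter_by_metadata(documents, required_metadata)
    # Step 2: rank the remaining candidates by relevance to the query.
    ranked = sorted(candidates, key=lambda d: keyword_score(query, d["text"]), reverse=True)
    return ranked[:top_k]


# Example data (made up for illustration).
documents = [
    {"name": "a.txt", "text": "quarterly revenue report", "metadata": {"author": "Alice"}},
    {"name": "b.txt", "text": "quarterly revenue forecast", "metadata": {"author": "Bob"}},
    {"name": "c.txt", "text": "holiday schedule", "metadata": {"author": "Alice"}},
]
results = retrieve("revenue report", documents, {"author": "Alice"})
print([d["name"] for d in results])
```

With a filter like this, documents by other authors never reach the ranking step, so the prompt does not have to tell the LLM to ignore them.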

EeckhoutJens avatar Apr 03 '25 13:04 EeckhoutJens

I thought of this as well. An example of such metadata would be "paid/unpaid" for an invoice. It would be great to be able to filter on, or specify, this metadata as well.

Snify89 avatar Apr 03 '25 14:04 Snify89