ragflow

[Bug]: meta_fields: How to view all the field names contained in meta fields through Python code

Open lhxxrds opened this issue 1 month ago • 6 comments

Self Checks

  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (Language Policy).
  • [x] Non-English title submissions will be closed directly (Language Policy).
  • [x] Please do not modify this template :) and fill in all the required fields.

RAGFlow workspace code commit ID

567

RAGFlow image version

v0.22.1

Other environment information

meta_fields: How to view all the field names contained in meta fields through Python code.

Actual behavior

meta_fields: How to view all the field names contained in meta fields through Python code.

Expected behavior

meta_fields: How to view all the field names contained in meta fields through Python code.

Steps to reproduce

meta_fields: How to view all the field names contained in meta fields through Python code.

Additional information

meta_fields: How to view all the field names contained in meta fields through Python code.

lhxxrds, Nov 25 '25 08:11

You can view all the field names contained in meta fields for a document in RAGFlow by accessing the meta_fields attribute, which is a dictionary. For a single document, use:

field_names = list(doc.meta_fields.keys())
print(field_names)

To get all meta field names across a dataset, use the SDK's get_meta_by_kbs method:

meta_info = rag.get_meta_by_kbs([dataset_id])
print(meta_info.keys())  # Shows all meta field names present in the dataset

Meta fields must be set in each document's meta_fields to be visible and filterable. If you batch upload documents, update their metadata individually to ensure the fields are present. For more details, see the Document model definition and SDK usage examples.
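As a minimal sketch of the per-document approach applied to a whole dataset, the helper below collects the union of meta_fields keys over a list of documents. It assumes each document object exposes a meta_fields attribute holding a dict (or None), as described above. The stand-in objects here are hypothetical. With the SDK they would presumably come from dataset.list_documents(), which is paginated, so a large dataset may need several calls.

```python
from types import SimpleNamespace


def collect_meta_field_names(docs):
    """Return the sorted union of meta_fields keys over an iterable of documents."""
    names = set()
    for doc in docs:
        # Documents without metadata may carry None or an empty dict.
        names.update((getattr(doc, "meta_fields", None) or {}).keys())
    return sorted(names)


# Hypothetical stand-ins for SDK Document objects.
docs = [
    SimpleNamespace(name="a.txt", meta_fields={"doc_name": "a.txt"}),
    SimpleNamespace(name="b.txt", meta_fields={"doc_name": "b.txt", "year": 2024}),
    SimpleNamespace(name="c.txt", meta_fields=None),
]

print(collect_meta_field_names(docs))
```

Comparing this list against each document's own keys is a quick way to spot documents that are missing a field you intend to filter on.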

To reply, just mention @dosu.

dosubot[bot], Nov 25 '25 08:11

{'doc_name': '2024发电机组计划检修.txt'} I have already obtained the corresponding metadata using the above method. Could you please provide a complete Python code example that implements metadata filtering, specifically retrieving all documents where the value of the doc_name key contains "Shanxi Province"? Please write only the full Python code. @dosu

lhxxrds, Nov 25 '25 09:11

from ragflow_sdk import RAGFlow

# Initialize the RAGFlow client
rag = RAGFlow(api_key="YOUR_API_KEY", base_url="http://YOUR_BASE_URL:9380")

# Get your dataset (replace with your actual dataset name or ID)
datasets = rag.list_datasets(name="YOUR_DATASET_NAME")
dataset = datasets[0]

# Build the metadata_condition for substring filtering on the 'doc_name' field
metadata_condition = {
    "logic": "and",
    "conditions": [
        {
            "name": "doc_name",  # must match the key in meta_fields
            "comparison_operator": "contains",
            "value": "Shanxi Province"
        }
    ]
}

# Retrieve document chunks matching the filter
results = rag.retrieve(
    question="",  # or your search query
    dataset_ids=[dataset.id],
    metadata_condition=metadata_condition
)

for chunk in results:
    print(chunk.document_name, chunk.content)

# Note: The 'doc_name' field must be present in each document's meta_fields for filtering to work.

This code works with RAGFlow v0.22.1 and later. If you batch uploaded documents, make sure each document's meta_fields includes the 'doc_name' key for filtering to be effective.
Reference: metadata filtering example
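To check a filter before sending it, the "contains" condition can be evaluated locally against the meta_fields you already retrieved. The sketch below is only an illustration of the intended semantics, not RAGFlow's server-side implementation: it handles just the "contains" operator and the "and"/"or" logic, and it compares strings case-sensitively, which the server may or may not do. The second sample dict is hypothetical.

```python
def matches(meta_fields, metadata_condition):
    """Evaluate a metadata_condition dict against one document's meta_fields."""

    def check(cond):
        value = (meta_fields or {}).get(cond["name"])
        if value is None:
            return False  # a missing key never matches
        if cond["comparison_operator"] == "contains":
            return str(cond["value"]) in str(value)
        raise ValueError("operator not covered by this sketch")

    results = [check(c) for c in metadata_condition["conditions"]]
    if metadata_condition.get("logic", "and") == "and":
        return all(results)
    return any(results)


condition = {
    "logic": "and",
    "conditions": [
        {"name": "doc_name", "comparison_operator": "contains", "value": "Shanxi Province"}
    ],
}

print(matches({"doc_name": "2024发电机组计划检修.txt"}, condition))       # False
print(matches({"doc_name": "Shanxi Province report.txt"}, condition))  # True
print(matches({}, condition))                                          # False
```

Note that the doc_name value quoted earlier in this thread does not contain the English string "Shanxi Province", so a substring filter on that exact text would not match it.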

dosubot[bot], Nov 25 '25 09:11

Is my understanding of keyword filtering correct? For instance, when I use document names as metadata for filtering, during the retrieval process the system will only search documents that meet the metadata conditions, and documents that do not meet the conditions will not participate in subsequent vector or keyword matching at all. The final returned result only contains document content that matches the metadata. @dosu

lhxxrds, Nov 25 '25 09:11

Yes, your understanding is correct for RAGFlow v0.22.1: when you use metadata filtering (for example, filtering by document name), only documents whose meta_fields match the filter participate in retrieval. Documents that do not meet the metadata conditions are excluded from all subsequent vector or keyword matching, and the final results only contain content from the filtered set. This pre-filtering is implemented at the start of the retrieval process, so it narrows the search space before any semantic or keyword search happens.

A few caveats: filtering only works if the metadata field is present in every document's meta_fields, and the field name in your filter must exactly match the key in meta_fields. Batch uploads do not set meta_fields by default, so you may need to update them individually after upload. Also, passing explicit document_ids disables metadata filtering.

If you need to debug, check your meta_fields and filter structure, and make sure you're not filtering on chunk-level properties that aren't included in meta_fields. If you follow these guidelines, metadata filtering will work as you described.
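Since batch uploads reportedly leave meta_fields empty, one way to make a doc_name filter usable is to backfill the key from each document's name. The sketch below assumes that document objects expose name and meta_fields attributes and that update() accepts a dict with a "meta_fields" key; treat that call shape as an assumption to verify against your SDK version. FakeDoc is a hypothetical stand-in so the logic can be run without a server.

```python
def backfill_doc_name(docs):
    """Set meta_fields['doc_name'] to the document name wherever it is missing."""
    updated = []
    for doc in docs:
        # Copy existing metadata so other keys are preserved on update.
        meta = dict(getattr(doc, "meta_fields", None) or {})
        if "doc_name" in meta:
            continue
        meta["doc_name"] = doc.name
        doc.update({"meta_fields": meta})  # assumed SDK call shape
        updated.append(doc.name)
    return updated


class FakeDoc:
    """Hypothetical stand-in for an SDK document object."""

    def __init__(self, name, meta_fields=None):
        self.name = name
        self.meta_fields = meta_fields

    def update(self, message):
        self.meta_fields = message["meta_fields"]


docs = [
    FakeDoc("a.txt"),
    FakeDoc("b.txt", {"doc_name": "b.txt"}),
    FakeDoc("c.txt", {"year": 2024}),
]
changed = backfill_doc_name(docs)
print(changed)
```

The helper sends the merged dict rather than only the new key, on the cautious assumption that an update replaces the stored meta_fields instead of merging into it.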

dosubot[bot], Nov 25 '25 09:11

May I ask approximately when the next version will be released, and how long it will take for the above-mentioned metadata filtering to be fixed? @dosu

lhxxrds, Nov 26 '25 01:11

Do you want an HTTP API to achieve this?

Magicbook1108, Nov 28 '25 03:11