
Metadata filter works in Chatbot but does NOT work in workflow

Open gemchen opened this issue 9 months ago • 5 comments

Self Checks

  • [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

Dify version

1.1.1

Cloud or Self Hosted

Self Hosted (Source)

Steps to reproduce

I created “Metadata Filter Conditions” in a Chatbot so that it only recalls documents whose 'document_name' starts with '<filter_word>'. When I create the same filter in a workflow, the filter is ignored and all documents matching the search input are recalled.

✔️ Expected Behavior

The metadata filter works as expected in both the Chatbot and the workflow.

❌ Actual Behavior

I tried the criteria below in the workflow. The results are:

  • contains: ❌
  • not contains: ❌
  • start with: ❌
  • end with: ❌
  • is: ❌
  • empty: ✅
  • not empty / before / after / ≥ / ≤: not tested
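
For reference, the expected semantics of the string operators tested above can be sketched as follows. This is a minimal illustration and not Dify's implementation: the helper name `matches`, the operator spellings (taken from the labels in this report), and the sample document names are all assumptions.

```python
def matches(value, operator, expected=""):
    """Return True if a metadata value satisfies one filter condition.

    Hypothetical helper; operator names mirror the labels used in this report.
    """
    # Treat a missing value as an empty string so every operator is defined.
    text = "" if value is None else str(value)
    if operator == "empty":
        return text == ""
    if operator == "not empty":
        return text != ""
    if operator == "contains":
        return expected in text
    if operator == "not contains":
        return expected not in text
    if operator == "start with":
        return text.startswith(expected)
    if operator == "end with":
        return text.endswith(expected)
    if operator == "is":
        return text == expected
    raise ValueError(f"unsupported operator: {operator}")


# Made-up documents, only to show what "start with" should keep.
docs = [
    {"document_name": "Singapore_travel_guide.pdf"},
    {"document_name": "Malaysia_travel_guide.pdf"},
]
kept = [
    d["document_name"]
    for d in docs
    if matches(d["document_name"], "start with", "Singapore")
]
```

With a working filter, only documents for which `matches(...)` is true should appear in the retrieval output; in the workflow, documents that fail the condition are returned as well.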

gemchen avatar Mar 21 '25 09:03 gemchen

Hello @gemchen! I'm here to help you with any bugs, questions, or contributions you have in mind. Let's tackle this issue together!

The issue you're experiencing with metadata filters not working in the workflow component, while they work in the Chatbot, might be related to how the filters are implemented or applied in the workflow component.

In the workflow component, metadata filters are applied in the file web/app/components/workflow/panel/version-history-panel/filter/index.tsx. The filtering logic is handled by the Filter component, which uses the useFilterOptions hook to retrieve filter options and applies them through the FilterItem component. The handleOnClick function is used to apply a selected filter option, and the FilterSwitch component is used to toggle the display of named versions [1].

Additionally, the code structure for handling metadata filters involves functions that define how to extract and filter file metadata based on specified keys and conditions. These functions use Python's match statement to determine the appropriate operation based on the provided key or condition [2].

Given that the filters work in the Chatbot but not in the workflow, it might be worth checking if there are any discrepancies in how the filters are configured or applied in the workflow component. You may also want to verify if there are any differences in the input data or conditions used in both components.


dosubot[bot] avatar Mar 21 '25 09:03 dosubot[bot]

@gemchen could you please provide more details about this issue, such as some screenshots? Thanks.

JohnJyong avatar Mar 21 '25 10:03 JohnJyong

Bad news: the metadata filter is now not working in either my chatbot or my workflow.

Workflow case

  1. I created a very simple workflow and set a filter with "[document_name] [start with] [Singapore]". Image
  2. I ran a query, and the output includes documents whose names do NOT start with "Singapore". Image

Chatbot case

I created a chatbot with the same knowledge base and metadata filter, and ran a query. I expected the referenced documents to have names starting with "Singapore", but they do not.

Image

gemchen avatar Mar 21 '25 10:03 gemchen

version 1.1.1

Image

Image

stoplyy avatar Mar 21 '25 10:03 stoplyy

I noticed that in the _retriever function in dataset_retrieval.py, the parameter "metadata_condition" is not used in the else branch for the Dify provider (maybe 'vendor'). Is this the cause?

Image
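
The suspected pattern can be illustrated with a toy example. This is a sketch only, not the actual Dify code: `search`, `retrieve_buggy`, `retrieve_fixed`, the provider strings, and the sample documents are hypothetical stand-ins, and the real `_retriever` has a different signature.

```python
def search(query, docs, metadata_condition=None):
    """Toy search: substring match on content, then an optional metadata filter."""
    hits = [d for d in docs if query.lower() in d["content"].lower()]
    if metadata_condition is not None:
        hits = [d for d in hits if metadata_condition(d["metadata"])]
    return hits


def retrieve_buggy(provider, query, docs, metadata_condition=None):
    if provider == "external":
        return search(query, docs, metadata_condition)
    else:
        # Bug pattern: the condition is accepted but never forwarded,
        # so this branch silently returns unfiltered results.
        return search(query, docs)


def retrieve_fixed(provider, query, docs, metadata_condition=None):
    # Forward the condition on every branch.
    return search(query, docs, metadata_condition)


def starts_with_singapore(metadata):
    return metadata["document_name"].startswith("Singapore")


# Made-up documents for illustration.
docs = [
    {"content": "Visa rules for visitors", "metadata": {"document_name": "Singapore_visa.md"}},
    {"content": "Visa rules for students", "metadata": {"document_name": "Japan_visa.md"}},
]

buggy = retrieve_buggy("vendor", "visa", docs, starts_with_singapore)
fixed = retrieve_fixed("vendor", "visa", docs, starts_with_singapore)
```

If the parameter is dropped in one branch as in `retrieve_buggy`, the filter has no visible effect for that provider and no error is raised, which would match the behavior described in this issue.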

gemchen avatar Mar 21 '25 10:03 gemchen

Same problem. I created a custom metadata field, series, and I want to retrieve only the documents whose series field contains 'GS', but the retrieval output includes documents whose series field does not contain 'GS'.

Also self hosted (source).

Image Image

rabbit1753 avatar Mar 24 '25 07:03 rabbit1753

Thanks @rabbit1753 @gemchen @stoplyy for the reminder. We have fixed the problem where the metadata filter had no effect in keyword and full-text search; the PR is linked in this issue.

JohnJyong avatar Mar 24 '25 10:03 JohnJyong