Dify Knowledge Retrieval with external knowledge base

Self Checks

[X] This is only for bug report, if you would like to ask a question, please head to Discussions.
[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.9.1

Cloud or Self Hosted

Cloud

Steps to reproduce

I successfully connected an external knowledge base using the new feature provided in 0.9.1.

I can also test the knowledge base in the dedicated section, and it works fine.

When I try to insert a "Knowledge Retrieval" node in a workflow, I get this error

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

Oct 10 '24 00:10 akkaz

To resolve the "'NoneType' object has no attribute 'provider'" error when inserting "Knowledge Retrieval" in a workflow using Dify version 0.9.1, ensure that the reranking_model is properly initialized before accessing its provider attribute. This error likely occurs in the following section:

if node_data.multiple_retrieval_config.reranking_mode == "reranking_model":
    reranking_model = {
        "reranking_provider_name": node_data.multiple_retrieval_config.reranking_model.provider,
        "reranking_model_name": node_data.multiple_retrieval_config.reranking_model.model,
    }

Make sure that node_data.multiple_retrieval_config.reranking_model is not None and is correctly set up with the necessary attributes (provider and model). You might need to add a check or initialize it properly before this code block.
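
As a minimal sketch of such a check (not the actual Dify fix), the snippet below guards against a missing rerank model. The `RerankingModel` and `MultipleRetrievalConfig` classes and the `build_reranking_model` helper are hypothetical stand-ins for the real config objects; only the attribute names are taken from the code above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RerankingModel:
    # Hypothetical stand-in for the rerank model config object.
    provider: str
    model: str


@dataclass
class MultipleRetrievalConfig:
    # Hypothetical stand-in for node_data.multiple_retrieval_config.
    reranking_mode: str = "reranking_model"
    reranking_model: Optional[RerankingModel] = None


def build_reranking_model(config: MultipleRetrievalConfig) -> Optional[dict]:
    """Return the rerank settings dict, or None when no rerank model is usable."""
    if config.reranking_mode != "reranking_model":
        return None
    model = config.reranking_model
    if model is None:
        # The rerank switch is on but no model was saved: skip reranking
        # instead of raising "'NoneType' object has no attribute 'provider'".
        return None
    return {
        "reranking_provider_name": model.provider,
        "reranking_model_name": model.model,
    }
```

Whether the right behavior is to skip reranking silently or to raise a clearer validation error is a design choice; this sketch assumes skipping is acceptable.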

Additionally, review the _convert_to_knowledge_retrieval_node function in the workflow_converter.py file to ensure that the multiple_retrieval_config and its reranking_model are set up correctly, especially when the retrieval mode is MULTIPLE ^[1]^[2].

Oct 10 '24 00:10 dosubot[bot]

@akkaz This error may occur when the rerank switch is turned on but no Rerank Model has been successfully configured. We are working on a fix. The feature can still be used normally in the meantime: either temporarily turn off the rerank model switch, or configure an available rerank model.

Oct 11 '24 08:10 Yawen-1010

@akkaz This error may occur when the rerank switch is turned on but no Rerank Model has been successfully configured. We are working on a fix. The feature can still be used normally in the meantime: either temporarily turn off the rerank model switch, or configure an available rerank model.

In my case, the issue occurred because the knowledge retrieval returned an empty value, despite the status being a successful 200. Upon reviewing the code, I identified a potential problem here: https://github.com/langgenius/dify/blob/f4ce08211d3c614f6fbffb47d55e3a96678fb78d/api/core/workflow/nodes/knowledge_retrieval/knowledge_retrieval_node.py#L82. After commenting out that line, the issue was resolved.

Additionally, the documentation mentions that metadata should be a string (https://docs.dify.ai/zh-hans/guides/knowledge-base/external-knowledge-api-documentation), but in practice, the backend treats it as a dictionary (https://github.com/langgenius/dify/blob/3f1aa1f9e263f1134ba0a89c61a5b0dd14eac0ac/api/core/rag/models/document.py#L18). These inconsistencies didn't raise any errors during recall tests, but in actual workflows, they caused failures with empty data being returned.
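
To illustrate the metadata mismatch, here is a rough sketch of how an external knowledge service, or an adapter in front of it, could accept either shape and always hand back a dictionary. The `normalize_metadata` helper and the `raw` fallback key are my own assumptions, not part of Dify or its external knowledge API.

```python
import json
from typing import Any


def normalize_metadata(raw: Any) -> dict:
    """Coerce a record's metadata into a dict.

    Accepts a dict, a JSON-encoded string, or None. Hypothetical helper,
    not part of Dify.
    """
    if raw is None:
        return {}
    if isinstance(raw, dict):
        return raw
    if isinstance(raw, str):
        try:
            parsed = json.loads(raw)
        except json.JSONDecodeError:
            # Not JSON: keep the original text under an assumed "raw" key.
            return {"raw": raw}
        # JSON that is not an object (list, number, ...) is kept as text too.
        return parsed if isinstance(parsed, dict) else {"raw": raw}
    raise TypeError(f"unsupported metadata type: {type(raw).__name__}")
```

For example, `normalize_metadata('{"source": "a.pdf"}')` and `normalize_metadata({"source": "a.pdf"})` both yield the same dictionary.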

After fixing these two points, everything is functioning correctly. However, regarding the first issue, simply commenting out that line might introduce other problems, so a more robust solution may be needed.

Oct 11 '24 09:10 BGFGB

@JohnJyong

Oct 11 '24 09:10 Yawen-1010

For me too, enabling rerank fixes the error, but retrieval always returns empty results.

Oct 11 '24 09:10 sepa85

The cloud service has been updated to the latest version. Please try again, thanks ~

Oct 11 '24 13:10 JohnJyong

I still get an empty result, even though the knowledge base test returns results. I'm using the cloud version.

Oct 12 '24 00:10 sepa85

The cloud service has been updated to the latest version. Please try again, thanks ~

I see that your fix has been merged into the main branch, but the cloud service is still on version 0.9.1-fix1, so it seems the fix has not been deployed there yet.

Oct 12 '24 01:10 BGFGB

@JohnJyong I locally merged your fix branch into version 0.9.1, and so far the issue appears to be resolved. Thank you for the fix!

Oct 12 '24 02:10 BGFGB

After updating to version 0.9.2, I'm still encountering issues with the workflow.

Now, whenever I click "Run," I receive a "Rerank model is required" error, regardless of whether rerank is enabled or disabled.

Screenshot_20241014-231240~2.jpg

Screenshot_20241014-231235~2.jpg

Oct 14 '24 21:10 sepa85

Hello, thank you for bringing this issue to our attention. After reviewing it, it seems to be a frontend saving bug. For now, please try returning to the studio and then selecting the app. This should prevent the 'rerank model is required' issue from appearing. We'll address this bug in the next release. Thank you for your patience!

Oct 15 '24 02:10 YIXIAO0

Going to studio and back to workflow seems to "fix" the front end issue, but the result array is still empty.

Oct 15 '24 15:10 sepa85