The knowledge base retrieval speed was noticeably slower
Self Checks
- [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [x] I have searched for existing issues, including closed ones.
- [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [x] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [x] Please do not modify this template :) and fill in all the required fields.
Dify version
1.4.1
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
After upgrading from version 1.3.1 to 1.4.1, knowledge base retrieval becomes significantly slower when asking the same question against the same knowledge base content.
✔️ Expected Behavior
No response
❌ Actual Behavior
No response
+1
+1
@nutriver @593652385 Changing vector_store to Milvus fixed it for me.
A database index may have been lost and should be checked, for example "child_chunk_dataset_id_idx".
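One way to check for the index mentioned above, as a minimal sketch. It assumes Dify's metadata database is PostgreSQL and that you pass in a DB-API cursor (for example from psycopg2). `find_missing_indexes` and `EXPECTED_INDEXES` are hypothetical names; only `child_chunk_dataset_id_idx` comes from this thread.

```python
from typing import Any, Iterable, List

# Hypothetical list: only "child_chunk_dataset_id_idx" is named in this thread.
EXPECTED_INDEXES = ["child_chunk_dataset_id_idx"]

# pg_indexes is a standard PostgreSQL system view with one row per index.
INDEX_QUERY = "SELECT indexname FROM pg_indexes"


def find_missing_indexes(cursor: Any, expected: Iterable[str] = EXPECTED_INDEXES) -> List[str]:
    """Return the expected index names that pg_indexes does not list."""
    cursor.execute(INDEX_QUERY)
    present = {row[0] for row in cursor.fetchall()}
    return [name for name in expected if name not in present]
```

An empty result means every expected index is present. A non-empty result only shows that an index is absent, not that its absence explains the slowdown.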
+1 same problem
I can also confirm this, and 1.4.3 is the worst!
I had been using 1.3.0 for a while. To use Qwen 3 I upgraded to 1.4.3 and immediately noticed that Knowledge Retrieval performance had degraded significantly.
Here is an example: a very simple retrieval (just 2 files in the knowledge base) took 6.79 seconds!
Today I upgraded to 1.5.0 and the time for the same question improved to 980 ms, but that is still disappointing.
To verify that the regression was introduced in 1.4.x, I installed 1.3.0 again on a clean machine. Performance there is the best, 62 ms (as it should be).
BTW, I have verified that the retrieval time ratio between 1.3.0 and 1.5.0 is ~~almost always 1:10 (around 60 ms to 600 to 800 ms)~~. If I ask the same question twice in 1.3.0, the second retrieval always takes around 60 ms (the first may take about 150 ms), but in 1.5.0 the second retrieval time does not change much.
Due to IPR, I can only attach screenshots for this one question.
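The first-call versus repeat-call comparison described above can be made repeatable with a small timing helper. This is only a sketch: `time_retrieval` is a hypothetical name, and `retrieve` stands for whatever callable issues the same knowledge base query in your setup (it is not a Dify API).

```python
import time
from statistics import median
from typing import Callable, Dict, List


def time_retrieval(retrieve: Callable[[], object], repeats: int = 5) -> Dict[str, float]:
    """Call `retrieve` several times; report first-call and repeat-call latency in ms."""
    if repeats < 2:
        raise ValueError("repeats must be at least 2 to separate cold from warm calls")
    samples: List[float] = []
    for _ in range(repeats):
        start = time.perf_counter()
        retrieve()
        samples.append((time.perf_counter() - start) * 1000.0)
    # The first sample is the cold call; the rest are summarized by their median.
    return {"cold_ms": samples[0], "warm_median_ms": median(samples[1:])}
```

Running the same helper against two Dify versions with the same question gives numbers that can be compared directly, instead of single readings from the UI.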
Marking this. Same on 1.5.0.
+1 This issue also occurs in version 1.4.2 and has not been fixed.
I notice 1.5.1 has a new feature "Knowledge Basis Indexing: Introduced KB indexing by @Gevtolev in https://github.com/langgenius/dify/pull/20868. It optimizes the access and retrieval speeds for your data treasures."
So I upgraded to 1.5.1, but I did not notice a difference, or at least not an obvious one (I also created a new knowledge base for comparison).
I've fixed this issue through the following steps, as detailed in this issue: https://github.com/langgenius/dify/issues/21549.
I upgraded the Weaviate version in the Docker container to weaviate-client~=3.26.7, rebuilt the Docker image with this change, and then started the container using the updated image. This resolved the issue.
@nolynn Were you suggesting weaviate vector database is the reason ?
The issue lies in the Weaviate connection driver. Some versions of the driver perform an update check. For details, see: https://github.com/langgenius/dify/issues/20972.
I found modifying the source code to be cumbersome, so I simply updated the driver version instead.
Okay, I also found https://github.com/langgenius/dify/issues/20972 and modified the code directly:
in `/app/api/.venv/lib/python3.12/site-packages/weaviate/connect/connection.py`, I commented out `pkg_info = requests.get(PYPI_PACKAGE_URL, timeout=PYPI_TIMEOUT).json()`
~~restart the docker~~ (the Docker container should not be restarted), but I still don't see any difference.
Same on 1.5.1. @crazywoola, would you be able to help us?
My Dify version is 1.5.1, and after the upgrade I did find that vector database retrieval had become slower. I solved it using the method provided by @nolynn.
You can try it. @hiwuye @qiulang
Could you explain in detail how to change the weaviate-client version to weaviate-client~=3.26.7 in the Docker container, create an image from that container, and then start from that image to fix the issue? How do I do the upgrade? @dj-jack001 @nolynn
@XiaoCC @dj-jack001 @nolynn
I updated to weaviate-client~=3.26.7 directly in the Docker container and did not see any search performance improvement. So I doubt that rebuilding the Docker image with that change will make a difference.
@qiulang @XiaoCC My Dify version is 1.5.1. Upgrading weaviate-client to ~=3.26.7 does not require rebuilding the image or restarting the container; the container can keep running. The search slowdown was caused by a request to PyPI made before the search in weaviate-client versions below 3.26.7. For more details, see the following links:
- #21549
- Pull request: Revert weaviate-client version to speed up knowledge store search #20693
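If a request to PyPI really does sit in front of each search, its cost depends on how quickly the container can reach PyPI. A rough sketch for measuring that from inside the container follows. The URL is PyPI's public JSON endpoint for the package and is assumed to be comparable to what the client requests; `fetch_pypi` and `time_call_ms` are hypothetical helper names.

```python
import time
import urllib.request
from typing import Callable

# Assumption: PyPI's public JSON endpoint stands in for the client's update check.
PYPI_URL = "https://pypi.org/pypi/weaviate-client/json"


def fetch_pypi(url: str = PYPI_URL, timeout: float = 5.0) -> int:
    """Fetch the URL and return the number of bytes read."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return len(response.read())


def time_call_ms(call: Callable[[], object]) -> float:
    """Return how long `call` takes in milliseconds; network failures still count."""
    start = time.perf_counter()
    try:
        call()
    except OSError:
        # urllib.error.URLError and socket timeouts are OSError subclasses.
        pass
    return (time.perf_counter() - start) * 1000.0
```

Calling `time_call_ms(fetch_pypi)` inside the api container gives a figure to compare against the extra retrieval latency. A result near the timeout would suggest PyPI is unreachable from that network.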
@dj-jack001 You missed my point. I did exactly what you did: I updated weaviate-client directly in the Docker container.
I did not see the search performance improve.
@qiulang Oh, I see your comment above now; sorry, I didn't notice it earlier. I don't know the reason for that part. For me, completing the search within 1 second is enough.
@dj-jack001
I did update weaviate-client in both docker-api and docker-worker. Because they use a Poetry-created virtual environment without pip, I did it this way:
```
/app/api/.venv/bin/python -m ensurepip --upgrade
/app/api/.venv/bin/pip3 install weaviate-client==3.26.7 -i https://mirrors.aliyun.com/pypi/simple
# then confirm
root@24a2600514c6:/app/api# /app/api/.venv/bin/pip3 show weaviate-client
Name: weaviate-client
Version: 3.26.7
...
```
But I don't see a performance improvement. Using my earlier example, retrieval is still around 600 ms; I still don't get the 60 ms result (1/10 of that) I used to have.
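Since the virtual environment ships without pip, the installed version can also be confirmed with the standard library alone. This is a minimal sketch; `installed_version` is a hypothetical helper name, and it only reports what is installed in the interpreter that runs it.

```python
from importlib import metadata
from typing import Optional


def installed_version(package: str = "weaviate-client") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Run with the venv interpreter, e.g. /app/api/.venv/bin/python, in each container.
    print(installed_version())
```

Running it in both the api and worker containers shows whether the two environments ended up on the same client version.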
Hi, @nutriver. I'm Dosu, and I'm helping the Dify team manage their backlog and am marking this issue as stale.
Issue Summary:
- You reported a significant slowdown in knowledge base retrieval speed after upgrading from version 1.3.1 to 1.4.1 in a self-hosted Docker setup.
- Multiple users confirmed the issue persists through versions 1.4.x to 1.5.1, with benchmarks showing retrieval times increasing from ~60ms to several hundred milliseconds.
- Suggestions pointed to the Weaviate vector database client as the root cause, recommending upgrading the client to version 3.26.7 inside the Docker container.
- Some users, including you, reported that updating the client inside the container did not fully restore original performance, indicating the fix may be incomplete or requires additional steps.
Next Steps:
- Please let me know if this issue is still relevant with the latest version of Dify by commenting here to keep the discussion open.
- Otherwise, I will automatically close this issue in 15 days.
Thank you for your understanding and contribution!
Same on 1.8.
@crazywoola The user reports that the retrieval slowdown issue persists even in version 1.8. Could you please assist with investigating this ongoing performance problem?