Dify Studio becomes very slow after uploading and embedding a large number of files
Self Checks
- [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.6.14
Cloud or Self Hosted
Self Hosted (Docker)
Steps to reproduce
This issue has occurred twice. Create a knowledge base and upload a large number of files at once, say 500 files. Choose the High Quality option, which uses an embedding model. After uploading and embedding reach 100%, Dify becomes very slow: it responds very slowly when opening an app to edit it or when running an app. The only way to recover is to restart the Dify Docker containers.
✔️ Expected Behavior
Dify responds normally after uploading and embedding a large number of files has finished.
❌ Actual Behavior
Dify responds very, very slowly after uploading and embedding a large number of files has finished.
To resolve performance issues in Dify Studio after uploading and embedding a large number of files, you can try the following steps:
- Limit the `max_tokens` setting: Reducing the `max_tokens` setting can help lower the computational cost and time. For example, setting `max_tokens=500` limits the output to the first 500 tokens, which can improve efficiency.
- Check for rate limits: Ensure that your embedding model API key has not reached its rate limit, as this can cause delays and performance issues.
- Optimize document segmentation: Properly segment long texts to improve processing efficiency. For example, segment texts by paragraphs or sentences depending on the task.
- Adjust document upload limits: If you are using a locally deployed version, you can modify the document upload size and quantity limits to better suit your needs.
- Use appropriate models: Choose models that are optimized for handling long texts and complex tasks. For example, `gpt-3.5-turbo` and `gpt-4` are recommended for their capabilities in understanding long text and cross-document reasoning.
- Monitor and manage API usage: Regularly check your API usage and ensure you have not exceeded your quota. If you encounter errors such as "Rate limit reached" or "Validation failed: You exceeded your current quota," you may need to upgrade your plan or manage your usage more effectively [1][2].
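As a rough illustration of the paragraph-based segmentation mentioned in the list above, the following Python sketch splits a text on blank lines and packs paragraphs into segments up to a size limit. This is not Dify's own segmentation logic; the function name `segment_by_paragraph` and the `max_chars` parameter are hypothetical and chosen only for this example.

```python
import re


def segment_by_paragraph(text: str, max_chars: int = 500) -> list[str]:
    """Split text on blank lines, then pack paragraphs into segments.

    Hypothetical helper for illustration only; not part of Dify.
    """
    # A paragraph boundary is one or more blank (whitespace-only) lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]

    segments: list[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = f"{current}\n\n{paragraph}" if current else paragraph
        if len(candidate) <= max_chars:
            # The paragraph still fits into the segment being built.
            current = candidate
        else:
            if current:
                segments.append(current)
            # A paragraph longer than max_chars is kept whole, not cut.
            current = paragraph
    if current:
        segments.append(current)
    return segments
```

Paragraphs are never cut in the middle here, so a single oversized paragraph yields an oversized segment; splitting such paragraphs by sentence would be a separate step.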
Additionally, you might want to consider increasing the hardware specifications of your self-hosted instance. The recommended minimum specifications are:
- CPU: At least 2 cores
- RAM: At least 4GB
For handling a large number of files, you might need to consider higher specifications depending on the specific workload and performance requirements. It is advisable to monitor the system's performance and scale up resources as needed [3][4].
This reply does not answer my question.
Can you provide the logs of the api container? Actually, I am not sure whether this is related to the frontend or the backend.
OK, I will provide them once the issue is reproduced.
> Can you provide the logs of the api container? Actually, I am not sure whether this is related to the frontend or the backend.
I reproduced the issue and found one abnormality: gunicorn is overloaded. Once the system is stuck, gunicorn's CPU usage stays at 100% indefinitely. The only way to recover is to restart the containers. Please see the screenshot below:
As for the container logs, I don't see any obvious errors.
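For anyone trying to confirm the same symptom, one possible way to spot a container pinned near 100% CPU is to parse the output of `docker stats`. The Python sketch below is only an illustration under assumptions: the helper names (`parse_cpu`, `busy_containers`, `read_docker_stats`) and the 90% threshold are made up for this example, and it assumes the `docker` CLI is on the PATH.

```python
import subprocess


def parse_cpu(stats_output: str) -> dict[str, float]:
    """Parse lines of the form '<name>\\t<cpu>%' into {name: cpu_percent}."""
    usage: dict[str, float] = {}
    for line in stats_output.strip().splitlines():
        name, _, cpu = line.partition("\t")
        if not cpu:
            continue  # skip lines that do not contain both fields
        usage[name.strip()] = float(cpu.strip().rstrip("%"))
    return usage


def busy_containers(usage: dict[str, float], threshold: float = 90.0) -> list[str]:
    """Return the names of containers at or above the CPU threshold."""
    return sorted(name for name, cpu in usage.items() if cpu >= threshold)


def read_docker_stats() -> str:
    """Take one snapshot of per-container CPU usage via the docker CLI."""
    result = subprocess.run(
        ["docker", "stats", "--no-stream", "--format", "{{.Name}}\t{{.CPUPerc}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `busy_containers(parse_cpu(read_docker_stats()))` a few times while Dify is stuck would show whether the same container stays above the threshold; the container names depend on the local Docker Compose setup.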
See https://github.com/langgenius/dify/issues/7677#issue-2487894362