dify Dify Studio becomes very slow after uploading and embedding a large number of files

Self Checks

[X] This is only for bug report, if you would like to ask a question, please head to Discussions.
[X] I have searched for existing issues, including closed ones.
[X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
[X] [FOR CHINESE USERS] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
[X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.14

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

This issue has occurred twice. Create a knowledge base and upload a large number of files at once, say 500 files. Choose High Quality so that an embedding model is used. After uploading and embedding reach 100%, Dify becomes very slow: it responds very slowly when opening an app to edit or when running an app. The only way to recover is to restart the Dify Docker containers.

✔️ Expected Behavior

Dify responds normally after uploading and embedding a large number of files is done.

❌ Actual Behavior

Dify responds very, very slowly after uploading and embedding a large number of files is done.

Aug 06 '24 06:08 ddnoodle

To resolve performance issues in Dify Studio after uploading and embedding a large number of files, you can try the following steps:

Limit the max_tokens Setting: Reducing the max_tokens setting can help lower the computational cost and time. For example, setting max_tokens=500 limits the output to the first 500 tokens, which can improve efficiency.
Check for Rate Limits: Ensure that your embedding model API key has not reached its rate limit, as this can cause delays and performance issues.
Optimize Document Segmentation: Properly segment long texts to improve processing efficiency. For example, segment texts by paragraphs or sentences depending on the task.
Adjust Document Upload Limits: If you are using a locally deployed version, you can modify the document upload size and quantity limits to better suit your needs.
Use Appropriate Models: Choose models that are optimized for handling long texts and complex tasks. For example, gpt-3.5-turbo and gpt-4 are recommended for their capabilities in understanding long text and cross-document reasoning.
Monitor and Manage API Usage: Regularly check your API usage and ensure you have not exceeded your quota. If you encounter errors such as "Rate limit reached" or "Validation failed: You exceeded your current quota," you may need to upgrade your plan or manage your usage more effectively ^[1]^[2].

Additionally, you might want to consider increasing the hardware specifications of your self-hosted instance. The recommended minimum specifications are:

CPU: At least 2 cores
RAM: At least 4GB

For handling a large number of files, you might need to consider higher specifications depending on the specific workload and performance requirements. It is advisable to monitor the system's performance and scale up resources as needed ^[3]^[4].


Aug 06 '24 06:08 dosubot[bot]

This reply does not answer my question.

Aug 06 '24 06:08 ddnoodle

Can you provide the api container logs? Actually, I am not sure whether this is related to the frontend or the backend.

Aug 06 '24 07:08 crazywoola

OK, I will provide when the issue was reproduced.

Aug 06 '24 08:08 ddnoodle

Can you provide the api container logs? Actually, I am not sure whether this is related to the frontend or the backend.

I reproduced the issue and found one anomaly: gunicorn is overloaded. Once the system is stuck, gunicorn's CPU usage stays at 100% indefinitely. The only way to recover is to restart the containers. Please see the screenshot below (image: gunicorn stuck).

As for the container logs, I don't see any obvious errors.
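
For anyone trying to confirm the same symptom, here is a minimal sketch of how a pegged gunicorn process could be picked out of process-list text. It assumes output in the layout of `ps -eo pid,pcpu,args` captured inside the api container; the function name `find_busy_workers`, the 90% threshold, and the sample rows are all hypothetical and are not taken from the screenshot.

```python
from typing import List, NamedTuple


class ProcSample(NamedTuple):
    pid: int
    cpu_percent: float
    command: str


def find_busy_workers(ps_output: str, name: str = "gunicorn",
                      threshold: float = 90.0) -> List[ProcSample]:
    """Return processes whose command contains `name` and whose
    CPU usage is at or above `threshold` percent."""
    busy = []
    for line in ps_output.strip().splitlines():
        # Split into pid, %cpu, and the rest of the line (the command).
        parts = line.split(None, 2)
        if len(parts) < 3:
            continue
        pid_text, cpu_text, command = parts
        try:
            sample = ProcSample(int(pid_text), float(cpu_text), command.strip())
        except ValueError:
            continue  # header row or malformed line
        if name in sample.command and sample.cpu_percent >= threshold:
            busy.append(sample)
    return busy


# Hypothetical sample in the layout of `ps -eo pid,pcpu,args`.
SAMPLE = """\
  PID %CPU COMMAND
    1  0.1 /bin/bash /entrypoint.sh
   23  0.3 gunicorn: master [app:app]
   24 99.8 gunicorn: worker [app:app]
   25  0.2 gunicorn: worker [app:app]
"""

if __name__ == "__main__":
    for proc in find_busy_workers(SAMPLE):
        print(proc.pid, proc.cpu_percent, proc.command)
```

Sampling this a few times, some seconds apart, would show whether the same PID stays pinned near 100%, which is what distinguishes a stuck process from a short burst of load.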

Aug 27 '24 08:08 ddnoodle

See https://github.com/langgenius/dify/issues/7677#issue-2487894362

Sep 04 '24 14:09 crazywoola