
chat-messages API takes more time when conversation_id is set

Open yg1031 opened this issue 8 months ago • 3 comments

Self Checks

  • [x] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [x] I have searched for existing issues, including closed ones.
  • [x] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [x] [FOR CHINESE USERS] Please submit issues in English, or they will be closed. Thank you! :)
  • [x] Please do not modify this template :) and fill in all the required fields.

Dify version

1.1.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

The first-packet latency of the chat-messages API exceeds 5 seconds when conversation_id is set. If conversation_id is left empty, first-packet latency drops below 2 seconds, but there is no chat history, so the chatflow cannot remember previous turns. Running the same chatflow locally on the web page gives a first-packet latency of about 2 seconds. Setting auto_generate_name has no effect on first-token latency.
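To make the comparison reproducible, here is a minimal stdlib-only sketch that times the first streamed packet of POST /v1/chat-messages. The base URL, API key, and conversation ID are placeholders for your own self-hosted instance; run it once with a conversation_id and once with an empty string to compare.

```python
import json
import time
import urllib.request

def time_to_first_chunk(chunks):
    """Return (seconds until the first chunk arrives, that chunk)."""
    start = time.monotonic()
    first = next(iter(chunks))
    return time.monotonic() - start, first

def stream_chat(base_url, api_key, query, conversation_id=""):
    """Yield raw SSE lines from POST /v1/chat-messages in streaming mode."""
    body = json.dumps({
        "query": query,
        "inputs": {},
        "user": "latency-test",
        "response_mode": "streaming",
        "conversation_id": conversation_id,
    }).encode()
    req = urllib.request.Request(
        f"{base_url}/v1/chat-messages",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for line in resp:
            yield line

if __name__ == "__main__":
    # Placeholders: point these at your own deployment and app key.
    ttfb, _ = time_to_first_chunk(
        stream_chat("http://localhost", "app-xxx", "hello",
                    conversation_id="your-conversation-id")
    )
    print(f"first-packet latency: {ttfb:.2f}s")
```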

✔️ Expected Behavior

The first-packet latency of API calls should be similar to the latency observed during web page debugging.

❌ Actual Behavior

No response

yg1031 avatar Apr 25 '25 10:04 yg1031

Hello @yg1031! I'm Dosu, your friendly bot here to help you with bugs, questions, and anything else you might need while waiting for a human maintainer. Let's tackle this issue together! 🚀

I found some relevant information that might help address the issue you're experiencing with the chat-messages API when conversation_id is set.

  1. Handling Chat History: The chat-messages API uses the conversation_id to fetch and manage the chat list associated with a specific conversation. This process can introduce latency due to data fetching and processing, especially if the chat history is extensive or the network is slow [1].

  2. Potential Causes of Latency:

    • Data Fetching: The use of useSWR for fetching data can introduce latency, particularly if the data is large or the network conditions are poor.
    • Data Processing: Formatting the chat list and building the chat item tree can add latency, especially with many messages to process.
    • Network and Server Performance: Poor network conditions or server performance issues can also contribute to increased latency [1].
  3. Similar Discussions: There is a similar discussion about the chat-messages API responding slowly in streaming mode, which might provide additional insights or solutions. You can view it here [2].

These points might help you identify potential areas to optimize or investigate further in your setup.

To continue talking to Dosu, mention @dosu.



dosubot[bot] avatar Apr 25 '25 10:04 dosubot[bot]


Hello @dosu. In other words, handling chat history adds over 3 seconds of first-packet latency, and this delay happens on every single API call. Is there a way to keep the first-packet latency under 2 seconds while still keeping chat history working, like it behaves when testing locally on the web page?

yg1031 avatar Apr 25 '25 10:04 yg1031

To reduce the first-packet latency while maintaining chat history, consider the following optimizations:

  1. Server Configuration: Adjust server settings to improve performance. Increase the number of API server workers (SERVER_WORKER_AMOUNT) and concurrent connections (SERVER_WORKER_CONNECTIONS) to handle more requests efficiently. Also, consider increasing the database connection pool size (SQLALCHEMY_POOL_SIZE) to reduce latency in database operations [1].

  2. Caching Mechanisms: Implement caching strategies to store frequently accessed data, reducing the need to fetch the same data repeatedly. This can significantly decrease latency for subsequent API calls.

  3. Database Optimization: Ensure that your database queries are optimized. Use indexing to speed up data retrieval processes, especially for large datasets.

  4. SWR Configuration: Optimize the useSWR hook by adjusting revalidation options and cache management strategies. This can help reduce unnecessary data fetching and improve response times [2].

  5. Network and Infrastructure: Check your network conditions and server infrastructure. Ensure that they are optimized for high performance, as poor network conditions can contribute to latency.

By implementing these strategies, you should be able to reduce the latency while keeping chat history intact.
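For reference, the settings named in point 1 live in docker/.env of a self-hosted Docker deployment. The values below are illustrative starting points, not recommendations; tune them for your hardware and restart the containers afterwards.

```shell
# docker/.env — illustrative values, adjust for your host
SERVER_WORKER_AMOUNT=4          # API (gunicorn) worker processes
SERVER_WORKER_CONNECTIONS=10    # concurrent connections per worker
SQLALCHEMY_POOL_SIZE=30         # database connection pool size
```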

To continue talking to Dosu, mention @dosu.



dosubot[bot] avatar Apr 25 '25 11:04 dosubot[bot]

Conversation info is stored in the database. Has your service been running for a long time and accumulated many historical sessions? If so, the session table in the database will be very large.

hieheihei avatar Apr 30 '25 03:04 hieheihei

> Conversation info is stored in the database. Has your service been running for a long time and accumulated many historical sessions? If so, the session table in the database will be very large.

Thanks, yep, that was the issue! I trimmed the chat history down to 10 messages, and now it’s fast again.

yg1031 avatar May 06 '25 09:05 yg1031
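For anyone hitting the same wall: before trimming, you can check how large a conversation's history actually is through the public GET /v1/messages endpoint, which is paginated via first_id. This is a hedged stdlib-only sketch; the pagination cursor (using data[0]["id"] as the next first_id) follows one reading of the API docs and is worth verifying against your Dify version.

```python
import json
import urllib.request

def parse_history_page(payload):
    """Return (number of messages in this page, whether more pages remain)."""
    return len(payload.get("data", [])), bool(payload.get("has_more"))

def fetch_history_size(base_url, api_key, conversation_id, user):
    """Count messages in one conversation via GET /v1/messages (paginated)."""
    total, first_id = 0, None
    while True:
        url = (f"{base_url}/v1/messages?conversation_id={conversation_id}"
               f"&user={user}&limit=100")
        if first_id:
            # Assumption: the oldest message id on the page is the next cursor.
            url += f"&first_id={first_id}"
        req = urllib.request.Request(
            url, headers={"Authorization": f"Bearer {api_key}"})
        with urllib.request.urlopen(req) as resp:
            payload = json.load(resp)
        n, more = parse_history_page(payload)
        total += n
        if not more or n == 0:
            break
        first_id = payload["data"][0]["id"]
    return total
```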