API responses return Unicode escape sequences instead of non-ASCII characters

Open zwbao opened this issue 1 year ago • 2 comments

Self Checks

  • [X] This is only for bug report, if you would like to ask a question, please head to Discussions.
  • [X] I have searched for existing issues, including closed ones.
  • [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [X] [FOR CHINESE USERS] Please be sure to submit your issue in English, or it will be closed. Thank you! :)
  • [X] Please do not modify this template :) and fill in all the required fields.

Dify version

0.7.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Set up Dify using the provided instructions in the repository.
  2. Make a request that generates a response containing non-ASCII characters (e.g., Chinese characters) via the Dify API.
  3. Observe the response containing Unicode escape sequences instead of actual non-ASCII characters (e.g., \u60a3\u8005 instead of 患者).

Example request:

curl -X POST 'http://your-dify-instance/v1/workflows/run' \
--header 'Authorization: Bearer your-api-key' \
--header 'Content-Type: application/json' \
--data-raw '{
    "inputs": {"input": "没事"},
    "response_mode": "blocking",
    "user": "test-user"
}'

Example response:

{"task_id": "425b967c-2d14-4258-971f-663ed7fa9538", "workflow_run_id": "16f459fd-8c85-46b3-851b-b093486315c5", "data": {"id": "16f459fd-8c85-46b3-851b-b093486315c5", "workflow_id": "0e5e69c6-ebb4-4f3f-8161-030dced6140d", "status": "succeeded", "outputs": {"text": "\u6ca1\u4e8b"}, "error": null, "elapsed_time": 0.0038245338946580887, "total_tokens": 0, "total_steps": 2, "created_at": 1725614787, "finished_at": 1725614787}}

✔️ Expected Behavior

The API should return the response with actual non-ASCII characters (e.g., Chinese characters) directly in the output without Unicode escape sequences. For example:

{"task_id": "425b967c-2d14-4258-971f-663ed7fa9538", "workflow_run_id": "16f459fd-8c85-46b3-851b-b093486315c5", "data": {"id": "16f459fd-8c85-46b3-851b-b093486315c5", "workflow_id": "0e5e69c6-ebb4-4f3f-8161-030dced6140d", "status": "succeeded", "outputs": {"text": "没事"}, "error": null, "elapsed_time": 0.0038245338946580887, "total_tokens": 0, "total_steps": 2, "created_at": 1725614787, "finished_at": 1725614787}}

❌ Actual Behavior

The API returns the response with non-ASCII characters encoded as Unicode escape sequences, making it difficult to read directly:

{"task_id": "425b967c-2d14-4258-971f-663ed7fa9538", "workflow_run_id": "16f459fd-8c85-46b3-851b-b093486315c5", "data": {"id": "16f459fd-8c85-46b3-851b-b093486315c5", "workflow_id": "0e5e69c6-ebb4-4f3f-8161-030dced6140d", "status": "succeeded", "outputs": {"text": "\u6ca1\u4e8b"}, "error": null, "elapsed_time": 0.0038245338946580887, "total_tokens": 0, "total_steps": 2, "created_at": 1725614787, "finished_at": 1725614787}}

zwbao avatar Sep 06 '24 09:09 zwbao

Hey @zwbao, I'm here to help you with any bugs, questions, or contributions you might have for Dify. Let's squash those bugs and get your questions answered!

The issue you're encountering with Unicode escape sequences in API responses is likely due to the default JSON serialization settings in the Flask framework, which Dify uses for handling HTTP requests and responses [1].

To address this, you can change the JSON serialization settings of your Flask application so that non-ASCII characters are returned directly instead of as Unicode escape sequences. Note that jsonify does not accept an ensure_ascii argument, so the setting has to be changed at the application level. On Flask 2.2 and later this is the ensure_ascii attribute of the app's JSON provider (app.json). Older releases used the JSON_AS_ASCII config key or a custom app.json_encoder, both of which were removed in Flask 2.3.

Here's an example of how you can set this globally in your Flask application (assuming Flask 2.2 or later):

from flask import Flask, jsonify

app = Flask(__name__)
# Emit non-ASCII characters as-is instead of \uXXXX escapes.
app.json.ensure_ascii = False

@app.route('/example')
def example():
    response_data = {
        "text": "没事"
    }
    return jsonify(response_data)

if __name__ == '__main__':
    app.run()

By setting ensure_ascii to False, the API responses will include non-ASCII characters directly, making them more readable.
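
As a quick check of what ensure_ascii changes, the sketch below uses only Python's standard json module (no Flask) and the payload from the issue. Both forms are valid JSON and decode to the same value; only the representation on the wire differs.

```python
import json

payload = {"text": "没事"}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(payload)
# With ensure_ascii=False the characters are written out directly.
readable = json.dumps(payload, ensure_ascii=False)

print(escaped)   # {"text": "\u6ca1\u4e8b"}
print(readable)  # {"text": "没事"}

# Both strings parse back to the same Python object.
assert json.loads(escaped) == json.loads(readable) == payload
```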

If you need to apply this change to the entire Dify application, you would need to locate the main Flask application instance and apply the same setting to it. This should resolve the issue with Unicode escape sequences in responses that are serialized through Flask's JSON machinery.

To continue talking to Dosu, mention @dosu.

dosubot[bot] avatar Sep 06 '24 09:09 dosubot[bot]

I think this is fine. Unless you call the API with curl directly, most programs will handle this automatically.
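
To illustrate, this is roughly what a JSON-aware client does with the escaped response from the issue (trimmed here to a few fields). It is a minimal sketch using Python's standard library; the raw_body variable is a stand-in for the HTTP response body.

```python
import json

# Response body as it arrives on the wire (trimmed from the issue's example).
# The raw string keeps the backslashes literal, just like the bytes curl prints.
raw_body = r'{"data": {"status": "succeeded", "outputs": {"text": "\u6ca1\u4e8b"}}}'

# Any conforming JSON parser turns \uXXXX escapes back into characters.
result = json.loads(raw_body)
text = result["data"]["outputs"]["text"]
print(text)  # 没事
```

On the command line, piping curl's output through a JSON tool such as jq has the same effect.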

Hisir0909 avatar Sep 07 '24 07:09 Hisir0909

@dosu Where does Dify produce the SSE responses that come back in Unicode-escaped form? How can I modify that code so that it returns Chinese characters instead of Unicode escapes?

Sos-Zachary avatar May 23 '25 02:05 Sos-Zachary

To modify Dify so that SSE (Server-Sent Events) responses return Chinese characters directly (not as Unicode escape sequences), you need to address how the backend serializes JSON and how the frontend/client decodes and displays the data.

Where SSE responses are handled

  • The frontend code that processes SSE responses is in web/service/base.ts, specifically in the handleStream function. This function reads the SSE stream, decodes it as UTF-8, and parses the JSON payload. When it receives a message event, it uses a function called unicodeToChar to convert Unicode escape sequences to actual characters before passing them to the UI.

  • The backend (likely Python/Flask) is responsible for serializing the data sent over SSE. By default, Python's json.dumps() uses ensure_ascii=True, which escapes non-ASCII characters as Unicode sequences.

How to modify the code

1. Backend: Ensure non-ASCII characters are not escaped

You need to set ensure_ascii=False when serializing JSON for SSE responses in the backend. For example, if you are using Flask to stream responses:

import json
from flask import Response

def sse_response(data):
    # Ensure non-ASCII characters are not escaped
    json_data = json.dumps(data, ensure_ascii=False)
    return Response(f"data: {json_data}\n\n", mimetype='text/event-stream')

Find where your SSE responses are constructed and make sure ensure_ascii=False is set when calling json.dumps().
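
For a stream of several events, the same idea can be sketched as a generator that formats one SSE frame per event. This is an illustration only, not Dify's actual code: the helper names format_sse and stream_events and the event fields are hypothetical, and only the standard library is used so the snippet runs without Flask.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def format_sse(event: Dict[str, Any]) -> str:
    """Serialize one event as an SSE frame without escaping non-ASCII text."""
    body = json.dumps(event, ensure_ascii=False)
    # json.dumps still escapes control characters such as newlines,
    # so the payload always fits on a single "data:" line.
    return f"data: {body}\n\n"


def stream_events(events: Iterable[Dict[str, Any]]) -> Iterator[str]:
    """Yield SSE frames one at a time (hypothetical helper)."""
    for event in events:
        yield format_sse(event)


# Illustrative events; the field names are assumptions, not Dify's schema.
frames = list(stream_events([
    {"event": "message", "answer": "没"},
    {"event": "message", "answer": "事"},
]))
print("".join(frames), end="")
```

In a Flask view, a generator like this could be passed to Response(..., mimetype='text/event-stream') in place of the single string above. The stream has to be sent as UTF-8, which is the encoding SSE clients expect.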

2. Frontend: Decoding is already handled

On the frontend, the handleStream function already uses unicodeToChar to convert Unicode escape sequences to characters. If the backend sends actual Chinese characters (not Unicode escapes), this function will simply pass them through as-is, and the UI will display them correctly.

Summary

  • Update the backend SSE response serialization to use ensure_ascii=False so that Chinese characters are sent directly.
  • The frontend is already set up to handle and display these characters.

If you make this change, SSE responses should return Chinese characters directly instead of Unicode escape sequences, making them readable in the client UI and when using tools like curl or browser dev tools.

If you need help finding the exact backend file, let me know your deployment details (Python version, framework, etc.), and I can help you locate it more precisely.


dosubot[bot] avatar May 23 '25 02:05 dosubot[bot]

@dosu In this project, where is the backend code file that produces the SSE output for the API? I need to modify it so that it does not emit Unicode escape sequences.

Sos-Zachary avatar May 23 '25 10:05 Sos-Zachary