The tool description in the prompt is ASCII-encoded
Self Checks
- [X] This is only for bug reports; if you would like to ask a question, please head to Discussions.
- [X] I have searched for existing issues, including closed ones.
- [X] I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
- [X] Please be sure to submit the issue in English, otherwise it will be closed. Thank you! :)
- [X] Please do not modify this template :) and fill in all the required fields.
Dify version
0.6.15
Cloud or Self Hosted
Self Hosted (Docker), Self Hosted (Source)
Steps to reproduce
May I ask why the tool information in the prompt is all ASCII-escaped? Is this an error? I previously worked around it locally by changing the ensure_ascii=True argument in the json dump call. I would like to know whether this setting is necessary.
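For illustration, this is roughly what happens when a tool schema containing non-ASCII text is dumped with Python's default ensure_ascii=True (the tool data below is made up; it is not the exact structure Dify builds):

import json

# Hypothetical tool entry with a Chinese description (not Dify's actual schema).
tool = {
    "name": "weather_query",
    "description": "查询指定城市的天气",  # "Query the weather for a given city"
}

# Default json.dumps escapes every non-ASCII character, which is what the
# prompt ends up containing.
print(json.dumps(tool))
# {"name": "weather_query", "description": "\u67e5\u8be2\u6307\u5b9a\u57ce\u5e02\u7684\u5929\u6c14"}

# With ensure_ascii=False the original characters are kept.
print(json.dumps(tool, ensure_ascii=False))
# {"name": "weather_query", "description": "查询指定城市的天气"}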
✔️ Expected Behavior
No response
❌ Actual Behavior
No response
The tool information in the prompt is ASCII-escaped because the ensure_ascii parameter is set to True during JSON serialization. This is not an error but a deliberate setting in the codebase.
For example, in the unauthorized_handler function, the json.dumps method is used with ensure_ascii=True:
@login_manager.unauthorized_handler
def unauthorized_handler():
    """Handle unauthorized requests."""
    return Response(json.dumps({
        'code': 'unauthorized',
        'message': "Unauthorized."
    }, ensure_ascii=True), status=401, content_type="application/json")
If you want to change this behavior, you can modify the ensure_ascii parameter to False in the relevant parts of the code where JSON serialization occurs [1].
@dosu What is the purpose of this setting?
Setting ensure_ascii=True in the JSON serialization process ensures that the output is ASCII-only by escaping all non-ASCII characters using \uXXXX sequences. This can be useful for compatibility with systems that may not handle non-ASCII characters properly.
Changing ensure_ascii to False allows the JSON encoder to output non-ASCII characters as-is, making the JSON data more readable and compact if it contains many non-ASCII characters.
Example:
import json

data = {"message": "Hola, ¿cómo estás?"}

# ensure_ascii=True (the default): every non-ASCII character is escaped
json_data_ascii = json.dumps(data, ensure_ascii=True)
print(json_data_ascii)  # Output: {"message": "Hola, \u00bfc\u00f3mo est\u00e1s?"}

# ensure_ascii=False: non-ASCII characters are written as-is
json_data_non_ascii = json.dumps(data, ensure_ascii=False)
print(json_data_non_ascii)  # Output: {"message": "Hola, ¿cómo estás?"}
In the first case, non-ASCII characters are escaped, while in the second case, they are preserved [1].
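Note that both serializations decode back to the same data, so the escaping only changes the serialized text that ends up in the prompt, not the information it carries:

import json

data = {"message": "Hola, ¿cómo estás?"}
escaped = json.dumps(data, ensure_ascii=True)
raw = json.dumps(data, ensure_ascii=False)

# Both forms round-trip to the identical dictionary.
assert json.loads(escaped) == json.loads(raw) == data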
But the prompts are all ASCII-encoded; can LLMs understand them? @dosu
LLMs (Large Language Models) can understand prompts encoded in ASCII as well as those containing non-ASCII characters. However, escaping non-ASCII characters as \uXXXX sequences makes the text less readable for humans; it does not prevent the model from processing the information.
If readability is a concern and you want to ensure that non-ASCII characters are preserved in their original form, you can modify the ensure_ascii parameter to False in the JSON serialization process. This will allow the JSON encoder to output non-ASCII characters as-is, making the prompts more readable.
For example, in the unauthorized_handler function, you can change:
return Response(json.dumps({
    'code': 'unauthorized',
    'message': "Unauthorized."
}, ensure_ascii=True), status=401, content_type="application/json")
to:
return Response(json.dumps({
    'code': 'unauthorized',
    'message': "Unauthorized."
}, ensure_ascii=False), status=401, content_type="application/json")
This change will preserve non-ASCII characters in their original form in the JSON output.
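As a side note, flipping the flag does not produce invalid output: JSON text is UTF-8, so a body serialized with ensure_ascii=False is still parseable by any standards-compliant client; only the raw bytes on the wire change. A minimal check, independent of Dify's code:

import json

body = json.dumps({'code': 'unauthorized', 'message': '未授权'}, ensure_ascii=False)
wire_bytes = body.encode('utf-8')  # what the client actually receives
# A standard JSON parser reads the non-ASCII text back unchanged.
assert json.loads(wire_bytes.decode('utf-8'))['message'] == '未授权'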