
Fix: Preserve non-ASCII characters when calling json.dumps and json.dump

Open dlutwy opened this issue 1 year ago • 0 comments

Description

This pull request addresses an issue with how json.dumps and json.dump are called during prompt construction. Currently, the ensure_ascii parameter is not set, so it takes its default value of True and non-ASCII characters are escaped as \uXXXX sequences. This can make it harder for the LLM to understand the text in its original language.

Changes Made

  • Updated the json.dumps and json.dump calls to pass ensure_ascii=False. This keeps non-ASCII characters in their original form, making the input easier for the LLM to understand.
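To illustrate the difference, here is a minimal standalone sketch using only the standard library. It does not reproduce the ragas prompt code; the `payload` dictionary and the in-memory buffer are assumptions made for the example, with the question text borrowed from the sample in this PR.

```python
import io
import json

# Hypothetical payload, reusing the question text from the sample in this PR.
payload = {"question": "超级碗第一次举办是在什么时候"}

# Default behaviour (ensure_ascii=True): every non-ASCII character is
# written as a \uXXXX escape sequence.
escaped = json.dumps(payload)

# With ensure_ascii=False the characters are emitted unchanged.
preserved = json.dumps(payload, ensure_ascii=False)

print(escaped)
print(preserved)

# json.dump accepts the same keyword. An in-memory text buffer stands in
# for a real file here.
buffer = io.StringIO()
json.dump(payload, buffer, ensure_ascii=False)
dumped = buffer.getvalue()
```

When writing to a real file with ensure_ascii=False, it is safest to open it with encoding="utf-8", since the platform default encoding may not be able to represent these characters.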

Impact

Leaving non-ASCII characters unescaped means the prompt sent to the model contains the original text rather than \uXXXX escape sequences, which should help it interpret non-English input more accurately.
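As a rough check that the change is lossless, the sketch below serializes the answer string from the sample both ways and compares the results. Character count is used only as a simple proxy for prompt size. Actual token counts depend on the tokenizer and are not measured here.

```python
import json

# Answer string taken from the sample in this PR.
text = "第一次超级碗于1967年1月15日举行"

escaped = json.dumps(text)                        # default, ensure_ascii=True
preserved = json.dumps(text, ensure_ascii=False)  # proposed setting

# Both forms are valid JSON and decode back to the same string,
# so no information is lost either way.
same_after_decoding = json.loads(escaped) == json.loads(preserved) == text

# Each escaped character takes six ASCII characters (\uXXXX) instead of one.
print(same_after_decoding, len(preserved), len(escaped))
```

The two encodings differ only in how the text appears in the prompt, not in the data it represents.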

Sample Code

from ragas.metrics import context_recall

print(context_recall.context_recall_prompt.format(
    question='超级碗第一次举办是在什么时候 xx सूरज को क्या चलाता',
    context="\n".join(['第一次AFL–NFL世界冠军赛是一场美式足球比赛，于1967年1月15日在洛杉矶纪念体育场举行，位于洛杉矶。']),
    answer='第一次超级碗于1967年1月15日举行'
).to_string())

Before

...
Your actual task:

question: "\u8d85\u7ea7\u7897\u7b2c\u4e00\u6b21\u4e3e\u529e\u662f\u5728\u4ec0\u4e48\u65f6\u5019 xx \u0938\u0942\u0930\u091c \u0915\u094b \u0915\u094d\u092f\u093e \u091a\u0932\u093e\u0924\u093e"
context: "\u7b2c\u4e00\u6b21AFL\u2013NFL\u4e16\u754c\u51a0\u519b\u8d5b\u662f\u4e00\u573a\u7f8e\u5f0f\u8db3\u7403\u6bd4\u8d5b\uff0c\u4e8e1967\u5e741\u670815\u65e5\u5728\u6d1b\u6749\u77f6\u7eaa\u5ff5\u4f53\u80b2\u573a\u4e3e\u884c\uff0c\u4f4d\u4e8e\u6d1b\u6749\u77f6\u3002"
answer: "\u7b2c\u4e00\u6b21\u8d85\u7ea7\u7897\u4e8e1967\u5e741\u670815\u65e5\u4e3e\u884c"
classification: 

After

...
Your actual task:

question: "超级碗第一次举办是在什么时候 xx सूरज को क्या चलाता"
context: "第一次AFL–NFL世界冠军赛是一场美式足球比赛，于1967年1月15日在洛杉矶纪念体育场举行，位于洛杉矶。"
answer: "第一次超级碗于1967年1月15日举行"
classification: 

Please let me know if there are any questions or further changes needed.

dlutwy, Sep 20 '24 01:09