
fix: ensure that utf-8 characters are not translated into \uXXXX format

Open • largomst opened this issue 1 year ago • 0 comments

If non-ASCII characters such as Chinese in the data model's comments are escaped into \uXXXX format, the LLM will not be able to understand their meaning, so UTF-8 characters must be left unescaped when the model is serialized.


:rocket: This description was created by Ellipsis for commit f0935506defa7359aeff248d7cf58cb8fdc6668e

Summary:

Ensure UTF-8 characters are not translated into \uXXXX format in handle_response_model function in instructor/process_response.py.

Key points:

  • Updated instructor/process_response.py.
  • Modified handle_response_model function.
  • Set ensure_ascii=False in json.dumps calls to prevent UTF-8 characters from being translated into \uXXXX format.
  • Affected lines: 261, 335, 382.
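
The effect of the flag can be seen with a minimal sketch using only the standard library. The schema dict below is a hypothetical stand-in for what a data model might serialize to; it is not taken from instructor/process_response.py, and the field name and description are illustrative assumptions.

```python
import json

# Hypothetical schema fragment (not from the instructor codebase):
# a field whose description is written in Chinese.
schema = {
    "properties": {
        "name": {"description": "用户的姓名", "type": "string"},
    }
}

# Default behaviour (ensure_ascii=True): non-ASCII characters are
# escaped into \uXXXX sequences.
escaped = json.dumps(schema)

# With ensure_ascii=False the characters are written out unchanged.
readable = json.dumps(schema, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings decode to the same Python object with json.loads, so the change affects only how the text reads when it is placed in a prompt, not the data itself.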

Generated with :heart: by ellipsis.dev

largomst, Jul 11 '24 02:07