semantic-kernel icon indicating copy to clipboard operation
semantic-kernel copied to clipboard

Function inputs and escaping special characters with Unicode

Open sophialagerkranspandey opened this issue 1 year ago • 1 comments

Discussed in https://github.com/microsoft/semantic-kernel/discussions/7308

Originally posted by glorious-beard July 16, 2024 If I set a kernel argument to content containing special characters, (HTML tags, for example), and I look at the logger output from the kernel when it's invoking the function, I notice that the JSON object escapes all of the special character.

For example, if I set "input" to "

Version 1.2

....", the function argument looks like:
{"input":"\u003Cp\u003EVersion 1.2\u003C/p\u003E..."}

Two questions:

  1. Do the extra characters in escaping "<" and ">" with 5 additional characters incur extra token cost?
  2. Does the function call unescape these characters before it is sent to the LLM endpoint?

sophialagerkranspandey avatar Jul 19 '24 15:07 sophialagerkranspandey

I'm pretty sure that the characters are only encoded so we can print the log statement (so it shouldn't impact your logic), but adding folks to verify.

madsbolaris avatar Jul 19 '24 15:07 madsbolaris

In python, I can confirm, they are unescaped before being sent to the model, this happens within the from_element method for chat, and within the _invoke_internal method for text, hence it also does not add extra tokens (although tokenization on the model side might). @sophialagerkranspandey @glorious-beard

eavanvalkenburg avatar Jul 22 '24 08:07 eavanvalkenburg

We have protection to prevent prompt injection attacks which will encode potentially dangerous tags. If you trust the content you can change this behaviour, take a look at this sample to see the available options: https://github.com/microsoft/semantic-kernel/blob/main/dotnet/samples/Concepts/ChatPrompts/SafeChatPrompts.cs

markwallace-microsoft avatar Jul 24 '24 12:07 markwallace-microsoft

Closing this issue since it's handled in both C# and Python

madsbolaris avatar Jul 26 '24 15:07 madsbolaris