.Net: Bug: Semantic Kernel inflates token usage of input prompt by 20% due to Unicode escaping and URL encoding of the prompt
Describe the bug
The Semantic Kernel package in C# is inflating the usage tokens by around 20% for my prompts because it escapes Unicode characters and URL-encodes the string in the content field of the messages array.
I tested it on a prompt that has 49k input tokens according to the usage metadata returned by the Azure OpenAI API; when the same prompt goes through Semantic Kernel, the input tokens are inflated to 63k.
The difference in the prompt is very easy to see when it contains JSON, for example:
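Not Semantic Kernel code, just a small System.Text.Json sketch of the kind of escaping at play: with default serializer options, non-ASCII and HTML-sensitive characters are written as \uXXXX sequences, which noticeably lengthens the serialized content compared to relaxed escaping.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;

var message = new { role = "user", content = "Ejemplo – 名前 \"quoted\" & <tag>" };

// Default options escape non-ASCII and HTML-sensitive characters as \uXXXX,
// which makes the serialized content noticeably longer.
string escaped = JsonSerializer.Serialize(message);

// UnsafeRelaxedJsonEscaping keeps the original characters.
var relaxed = new JsonSerializerOptions { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };
string plain = JsonSerializer.Serialize(message, relaxed);

Console.WriteLine(escaped); // content becomes "Ejemplo \u2013 \u540D\u524D \u0022quoted\u0022 \u0026 \u003Ctag\u003E"
Console.WriteLine(plain);   // content stays "Ejemplo – 名前 \"quoted\" & <tag>"
```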
Platform
- OS: Windows
- IDE: Visual Studio
- Language: C#
- Source: NuGet package version 1.15
And it's not just the token consumption; it also costs more for the LLM to process. When the model receives escaped data, the responses also come back encoded, which should not happen. In my case I am using Gemini.
@markwallace-microsoft can you please check #7052
Fixed via https://github.com/microsoft/semantic-kernel/pull/7098
This is not fixed in the dotnet version 1.16.2.
When the prompt is generated, it is still sent with escaped characters.
This occurs when a prompt is passed, for example like the sketch below:
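A minimal sketch of the kind of call that shows the behavior; the deployment name, endpoint, and environment variable are placeholders, and the exact connector extension method may vary slightly between package versions.

```csharp
using System;
using Microsoft.SemanticKernel;

var kernel = Kernel.CreateBuilder()
    .AddAzureOpenAIChatCompletion(
        deploymentName: "my-deployment",                    // placeholder
        endpoint: "https://my-resource.openai.azure.com/",  // placeholder
        apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!)
    .Build();

// A prompt with non-ASCII characters and embedded JSON; in the logged/sent
// request the content shows up with \uXXXX escapes instead of the original text.
var prompt = "Resume el siguiente JSON: { \"título\": \"informe\", \"año\": 2024 }";

var result = await kernel.InvokePromptAsync(prompt);
Console.WriteLine(result);
```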
Is there a proper way to see the prompt that is actually sent to Azure OpenAI, instead of checking the chat history messages? I have the same issue here but with function calling: it somehow uses a lot of tokens even though the function returns very simple data, and I end up with thousands of tokens used.
@sonphnt Yes, if you enable LogLevel.Trace in your application, you will be able to see the prompt that is sent to the AI connector, like in the screenshot in the previous comment.
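For reference, a minimal sketch of wiring up Trace-level console logging on the kernel builder (assuming the Microsoft.Extensions.Logging.Console package is referenced; the connector registration values are placeholders). Prompt content is typically only logged at Trace level.

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();

// Register a logger factory at Trace level; Semantic Kernel picks it up via DI
// and the connectors log the rendered prompt / request they actually send.
builder.Services.AddLogging(logging =>
    logging.AddConsole().SetMinimumLevel(LogLevel.Trace));

builder.AddAzureOpenAIChatCompletion(
    deploymentName: "my-deployment",                    // placeholder
    endpoint: "https://my-resource.openai.azure.com/",  // placeholder
    apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_KEY")!);

var kernel = builder.Build();
```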
Any update on this? Looking at the sources, it seems to come from the way the chat history is logged using JsonSerializer without any options.
Locally I can reproduce it by serializing a TextContent, and I can fix it with these options: JsonSerializerOptions options = new() { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };
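A minimal repro sketch of what is described above, assuming the Microsoft.SemanticKernel TextContent type and standard System.Text.Json; the sample text is made up.

```csharp
using System;
using System.Text.Encodings.Web;
using System.Text.Json;
using Microsoft.SemanticKernel;

var content = new TextContent("Résumé: { \"clé\": \"valeur\" }");

// Default serialization escapes non-ASCII characters, e.g. é becomes \u00E9.
Console.WriteLine(JsonSerializer.Serialize(content));

// With UnsafeRelaxedJsonEscaping the text stays readable (and shorter).
JsonSerializerOptions options = new() { Encoder = JavaScriptEncoder.UnsafeRelaxedJsonEscaping };
Console.WriteLine(JsonSerializer.Serialize(content, options));
```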
Perhaps being able to pass this option down would be a way to fix this. It is quite confusing when looking at the logs, and moreover, I do not feel entirely confident about how the data is actually sent in the end.