.Net: Bug: Non-deterministic argument mapping time - Mapping the user's intent to function arguments sometimes takes too long.
Describe the bug
I have implemented a plugin with a kernel function that has many arguments. The following code shows the function declaration:
[KernelFunction]
[Description("Performs the search operation related to Invoices.")]
public async Task<TestResult> TestArgumentMappingAsync(
[Description("The user's ask or intent")] string intent,
[Description("The customer number or Kundennummer. If not specified use let the system use the default one.")] string? customerNumber,
[Description("True if the user's intent is assotiated to the single document or invoice.")] bool? isSingleInvoiceRequested,
[Description("Specifies if the user's intent requires ascending or descending sorting direction.")] SortOrder? sortingDirection,
[Description("Invoices created after the given time.")] DateTime? fromTime,
[Description("Invoices created before the given time.")] DateTime? toTime,
[Description("The status of the invoice.")] InvoiceState status,
[Description("The docment/invoice number of the invoice.")] string? documentNo,
[Description("The docment/invoice description of the invoice.")] string? documentDescription,
[Description("The category of the document or invoice")] string? documentCategory,
[Description("The date of the document or invoice")] DateTime? documentDate,
[Description("The end date or expiration date of the document or invoice")] DateTime? documentEndDate,
[Description("The invoice number from the externl system.")] string? externalDocumetNumber,
[Description("The email of the orderer.")] string? ordererEmail,
[Description("The phone of the orderer.")] string? ordererPhone,
[Description("The currency of the document or invoice.")] string? documentCurrency,
[Description("The net value.")] Decimal? valueNet,
[Description("The operator for net value. User can ask value greather, equal or less than.")] QueryOperator? valueNetOperator,
[Description("The gross value.")] Decimal? valueGross,
[Description("The operator for gross value. User can ask value greather, equal or less than.")] QueryOperator? valueGrossOperator,
[Description("The description of the document or ivoice.")] string? description,
[Description("The name of the orderer of the document or invoice")] string? orderer,
[Description("The position of the document or invoice.")] decimal? docPosition,
[Description("The document o invoice line.")] string? docLine,
[Description("The article number of the document or invoice.")] string? articleNumber,
[Description("The article description of the document or invoice.")] string? articleDescription,
[Description("The ordered quantity.")] decimal? quantity,
[Description("The unit of measure.")] string? unit,
[Description("The serial number of the ordered article or product.")] string? serialNo,
[Description("Number at the customer address.")] string? addressNoCustomer,
[Description("The recipient of the order.")] string? recipient,
[Description("The delivery street of the order.")] string? deliveryStreet,
[Description("The delivery city of the order.")] string? deliveryCity,
[Description("The delivery postal code of the order.")] string? deliveryPostalCode,
[Description("The delivery country of the order.")] string? deliveryCountry,
[Description("Grouped BY Fields requested by one of agreggated operations.")] string[]? groupBy
)
{
...
}
The following code runs the model. Note that this code is executed on every user intent, which means that every intent uses a fresh instance of the kernel.
var chatCompletionService = _kernel.GetRequiredService<IChatCompletionService>();
// Enable auto function calling
OpenAIPromptExecutionSettings openAIPromptExecutionSettings = new()
{
ToolCallBehavior = ToolCallBehavior.AutoInvokeKernelFunctions,
Temperature = 0,
ChatSystemPrompt = sysPrompt,
StopSequences = new List<string>
{
"It seems there was another error due to a missing argument",
"It seems there was another error due to missing information"
}
};
IAsyncEnumerable<StreamingChatMessageContent>? result = null;
try
{
result = chatCompletionService.GetStreamingChatMessageContentsAsync(
history,
executionSettings: openAIPromptExecutionSettings,
kernel: _kernel);
StringBuilder completeMsgLog = new();
await foreach (var message in result)
{
completeMsgLog.Append(message.ToString());
await WriteToStreamAsync(message.ToString(), response);
}
}
Problem Description
In general, everything works fine: the model typically takes 2-4 seconds to map the user's intent to function arguments. However, after repeatedly testing the same intent, I noticed that the argument mapping sometimes takes 2-3 minutes.
Please note that this is not a coincidental observation; it is very easy to reproduce. For example, when the same intent is sent 10 times, the long mapping delay occurs at least once.
The function shown above is a test case. I have also tested the same function with 10 or fewer arguments, and the same behavior can be observed. However, since functions with fewer arguments are generally easier for the model to map, the slow cases do not take 2-3 minutes there. Instead, the mapping takes 15-20 seconds rather than the usual 1-2 seconds.
Recap
Mapping the intent to the arguments of the kernel function is mostly fast: 1-3 seconds, depending on the number of arguments. However, when the same intent is invoked multiple times in succession, some of the invocations take minutes instead of seconds.
To me, this does not seem to be deterministic behavior.
Question: Has anybody observed such behavior? How can it be monitored, and how can we avoid it?
Hi @ddobric, we have not observed this behavior, nor have we heard of anyone experiencing a similar issue.
I tried to reproduce the problem on gpt-4o-mini, but I was unable to do so. I ran this code, which performed twenty sequential completions with two function calls per completion, and overall it took 1 minute and 36 seconds, which is approximately 5 seconds per completion.
Try running this code and observe the model's response status code. Those 2-3 minute delays might be caused by the model throttling your requests by sending "429 Too Many Requests" responses if it is configured with a low SKU/RPS. You might also try running with a different model to check if it's reproducible across the models.
using System.ComponentModel;
using System.Diagnostics;
using System.Reflection;
using System.Text;
using Azure.Identity;
using Microsoft.Extensions.Configuration;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.OpenAI;
// xUnit v2 namespaces for [Fact] and ITestOutputHelper
using Xunit;
using Xunit.Abstractions;
namespace FunctionCalling;
public class FunctionCallingTests : IDisposable
{
private readonly HttpClient _httpClient;
private readonly ITestOutputHelper _output;
public FunctionCallingTests(ITestOutputHelper output)
{
this._output = output;
IConfigurationRoot configRoot = new ConfigurationBuilder()
.AddUserSecrets(Assembly.GetExecutingAssembly())
.Build();
TestConfiguration.Initialize(configRoot);
var handler = new CustomHandler(new HttpClientHandler(), output);
this._httpClient = new HttpClient(handler);
}
[Fact]
public async Task ReproduceArgsMappingIssueAsync()
{
IKernelBuilder builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
deploymentName: TestConfiguration.AzureOpenAI.ChatDeploymentName,
endpoint: TestConfiguration.AzureOpenAI.Endpoint,
credentials: new AzureCliCredential(),
modelId: TestConfiguration.AzureOpenAI.ChatModelId,
httpClient: this._httpClient);
Kernel kernel = builder.Build();
kernel.ImportPluginFromType<Utils>();
OpenAIPromptExecutionSettings settings = new() { FunctionChoiceBehavior = FunctionChoiceBehavior.Auto() };
IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();
var sw = Stopwatch.StartNew();
for (int i = 0; i < 20; i++)
{
var stringBuilder = new StringBuilder();
await foreach (var update in chatCompletionService.GetStreamingChatMessageContentsAsync(
"Analyze latest invoice using AnalyzeInvoice function",
settings,
kernel))
{
stringBuilder.Append(update.Content);
}
this._output.WriteLine(stringBuilder.ToString());
}
sw.Stop();
// ITestOutputHelper.WriteLine expects a string, so format the TimeSpan explicitly
this._output.WriteLine(sw.Elapsed.ToString());
}
public class TestArgumentMappingResult
{
}
public class InvoiceData
{
public string Intent { get; set; }
public string CustomerNumber { get; set; }
public bool? IsSingleInvoiceRequested { get; set; }
public SortOrder? SortingDirection { get; set; }
public DateTime? FromTime { get; set; }
public DateTime? ToTime { get; set; }
public InvoiceState Status { get; set; }
public string DocumentNo { get; set; }
public string DocumentDescription { get; set; }
public string DocumentCategory { get; set; }
public DateTime? DocumentDate { get; set; }
public DateTime? DocumentEndDate { get; set; }
public string ExternalDocumentNumber { get; set; }
public string OrdererEmail { get; set; }
public string OrdererPhone { get; set; }
public string DocumentCurrency { get; set; }
public decimal? ValueNet { get; set; }
public QueryOperator? ValueNetOperator { get; set; }
public decimal? ValueGross { get; set; }
public QueryOperator? ValueGrossOperator { get; set; }
public string Description { get; set; }
public string Orderer { get; set; }
public decimal? DocPosition { get; set; }
public string DocLine { get; set; }
public string ArticleNumber { get; set; }
public string ArticleDescription { get; set; }
public decimal? Quantity { get; set; }
public string Unit { get; set; }
public string SerialNo { get; set; }
public string AddressNoCustomer { get; set; }
public string Recipient { get; set; }
public string DeliveryStreet { get; set; }
public string DeliveryCity { get; set; }
public string DeliveryPostalCode { get; set; }
public string DeliveryCountry { get; set; }
public string[]? GroupBy { get; set; }
}
public enum SortOrder { Ascending, Descending }
public enum InvoiceState { Paid, Unpaid, Overdue }
public enum QueryOperator { Greater, Equal, Less }
public class Utils
{
[KernelFunction]
[Description("Performs an invoice analysis.")]
public TestArgumentMappingResult AnalyzeInvoice(
[Description("The user's ask or intent")] string intent,
[Description("The customer number or Kundennummer. If not specified use let the system use the default one.")] string? customerNumber,
[Description("True if the user's intent is assotiated to the single document or invoice.")] bool? isSingleInvoiceRequested,
//[Description("Specifies if the user's intent requires ascending or descending sorting direction.")] SortOrder? sortingDirection,
[Description("Invoices created after the given time.")] DateTime? fromTime,
[Description("Invoices created before the given time.")] DateTime? toTime,
//[Description("The status of the invoice.")] InvoiceState status,
[Description("The docment/invoice number of the invoice.")] string? documentNo,
[Description("The docment/invoice description of the invoice.")] string? documentDescription,
[Description("The category of the document or invoice")] string? documentCategory,
[Description("The date of the document or invoice")] DateTime? documentDate,
[Description("The end date or expiration date of the document or invoice")] DateTime? documentEndDate,
[Description("The invoice number from the externl system.")] string? externalDocumetNumber,
[Description("The email of the orderer.")] string? ordererEmail,
[Description("The phone of the orderer.")] string? ordererPhone,
[Description("The currency of the document or invoice.")] string? documentCurrency,
[Description("The net value.")] Decimal? valueNet,
//[Description("The operator for net value. User can ask value greather, equal or less than.")] QueryOperator? valueNetOperator,
[Description("The gross value.")] Decimal? valueGross,
//[Description("The operator for gross value. User can ask value greather, equal or less than.")] QueryOperator? valueGrossOperator,
[Description("The description of the document or ivoice.")] string? description,
[Description("The name of the orderer of the document or invoice")] string? orderer,
[Description("The position of the document or invoice.")] decimal? docPosition,
[Description("The document o invoice line.")] string? docLine,
[Description("The article number of the document or invoice.")] string? articleNumber,
[Description("The article description of the document or invoice.")] string? articleDescription,
[Description("The ordered quantity.")] decimal? quantity,
[Description("The unit of measure.")] string? unit,
[Description("The serial number of the ordered article or product.")] string? serialNo,
[Description("Number at the customer address.")] string? addressNoCustomer,
[Description("The recipient of the order.")] string? recipient,
[Description("The delivery street of the order.")] string? deliveryStreet,
[Description("The delivery city of the order.")] string? deliveryCity,
[Description("The delivery postal code of the order.")] string? deliveryPostalCode,
[Description("The delivery country of the order.")] string? deliveryCountry,
[Description("Grouped BY Fields requested by one of agreggated operations.")] string[]? groupBy)
{
return new TestArgumentMappingResult();
}
[KernelFunction]
public InvoiceData GetLatestInvoice()
{
return new InvoiceData
{
Intent = "ViewInvoiceDetails",
CustomerNumber = "123456",
IsSingleInvoiceRequested = false,
SortingDirection = SortOrder.Ascending,
FromTime = DateTime.Now.AddMonths(-1),
ToTime = DateTime.Now,
Status = InvoiceState.Paid,
DocumentNo = "INV-001",
DocumentDescription = "Invoice for services rendered",
DocumentCategory = "Services",
DocumentDate = DateTime.Now.AddDays(-30),
DocumentEndDate = DateTime.Now,
ExternalDocumentNumber = "EXT-INV-001",
OrdererEmail = "[email protected]",
OrdererPhone = "+1234567890",
DocumentCurrency = "USD",
ValueNet = 1000.00m,
ValueNetOperator = QueryOperator.Greater,
ValueGross = 1200.00m,
ValueGrossOperator = QueryOperator.Greater,
Description = "Standard invoice",
Orderer = "John Doe",
DocPosition = 1.00m,
DocLine = "Line 1",
ArticleNumber = "ART-001",
ArticleDescription = "Widget",
Quantity = 10.00m,
Unit = "pcs",
SerialNo = "SN-123456",
AddressNoCustomer = "789",
Recipient = "Jane Smith",
DeliveryStreet = "123 Main St",
DeliveryCity = "Anytown",
DeliveryPostalCode = "12345",
DeliveryCountry = "USA",
GroupBy = new string[] { "Category", "Date" }
};
}
}
private class CustomHandler(HttpMessageHandler innerHandler, ITestOutputHelper output) : DelegatingHandler(innerHandler)
{
private readonly ITestOutputHelper _output = output;
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
var response = await base.SendAsync(request, cancellationToken);
this._output.WriteLine($"Status code: {response.StatusCode}");
return response;
}
}
public void Dispose()
{
this._httpClient.Dispose();
}
}
Hi Sergey,
thanks for your response. Honestly, I have never seen this behavior before either, but I have also never used so many arguments. Step by step, we are trying to understand why this sometimes happens. First of all, it seems that Semantic Kernel does not correctly pass the temperature argument to the model. This is just my guess, because I set the temperature to 0 to make sure that the model behaves deterministically. One thing is certain: the model does not map the intent to arguments in a deterministic way.
Observation 1
For example, I found that when the long processing happens, the model streams the following output (chunk by chunk):
It seems there was an error due to missing sorting direction parameters. Let me correct that and try again. It seems there was another error due to missing status parameters. Let me fix that and try again. It seems there was another error due to missing document number parameters. Let me fix that and try again. . . . It seems there was another error due to missing records parameters. Let me fix that and try again. It seems there was another error due to missing group by parameters. Let me fix that and try again.
This takes minutes. In the end, the arguments are mapped correctly and the result is returned.
This all means that, for the same intent, the model sometimes simply IS NOT ABLE to map the arguments. I would expect that with temperature = 0 the model either always maps the intent or never maps it.
Observation 2
I also found that Description A causes the model to run into the mapping issue described above much more often than Description B.
A
[Description("The customer number or Kundennummer. If not specified let the system use the default one.")] string? customerNumber,
B
[Description("The customer number of the customer")] string? customerNumber,
Damir
I have figured out the following, which seems to be a bug:
All value-type arguments that are marked as nullable (int?, SomeEnum?, DateTime?... ) are correctly passed to the model as optional parameters.
However, reference-type arguments (such as string? in my case) are not treated as optional. This often leads to mapping errors, which significantly slow down the model’s performance.
Note, optional fields should be mapped as: "type": ["string", "null"],
Ref: https://platform.openai.com/docs/guides/function-calling?api-mode=chat
@SergeyMenshykh can you take a look at this issue and check that our strict mode implementation is correctly setting nulls, see: https://platform.openai.com/docs/guides/function-calling?api-mode=chat#strict-mode
The strict mode implementation in SK correctly maps optional parameters. In this context, "optional" refers to parameters that have a default value, rather than nullable ones marked with a ?. For example, consider the following parameter definition:
public class RecordData {public string Name { get; set; }}
public void CreateRecord(RecordData? record = null)
{
}
In this case, SK generates the following schema:
{
"type": "object",
"required": [
"record"
],
"properties": {
"record": {
"type": [
"object",
"null"
],
"properties": {
"Name": {
"type": [
"string",
"null"
]
}
},
"additionalProperties": false,
"required": [
"Name"
]
}
},
"additionalProperties": false
}
@ddobric, both value-type and reference-type parameters that are marked as nullable, either by using the Nullable type or the ? shorthand notation, are not considered optional by either C# or SK. They are simply nullable; a null value can be assigned if needed.
To make a parameter optional in .NET and SK, and to have it marked as optional/not required in the JSON schema provided to AI, it should be given a default value, i.e., F1(User required, User? nullable, User? optional = null).
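The difference between nullable and optional can be checked with plain reflection, independent of Semantic Kernel. The following is a minimal sketch: OptionalityDemo and its F1 method are hypothetical names that mirror the F1 signature above, and how SK turns this information into a JSON schema is not shown here.

```csharp
#nullable enable
using System;
using System.Linq;
using System.Reflection;

public static class OptionalityDemo
{
    // Hypothetical stand-in for a plugin method; not taken from the issue.
    public static void F1(string required, string? nullable, string? optional = null) { }

    // Reports, for each parameter of F1, whether C# treats it as optional.
    // Only a parameter with a default value is optional; "?" alone is not enough.
    public static string[] Describe()
    {
        MethodInfo method = typeof(OptionalityDemo).GetMethod(nameof(F1))!;
        return method.GetParameters()
            .Select(p => $"{p.Name}: {(p.IsOptional ? "optional" : "required")}")
            .ToArray();
    }

    public static void Main()
    {
        foreach (string line in Describe())
        {
            Console.WriteLine(line);
        }
    }
}
```

Only the third parameter is reported as optional, because it is the only one that declares a default value.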
> For example, I have figured out that when the long processing happens the model streams the following output (chunk by chunk): It seems there was an error due to missing sorting direction parameters. Let me correct that and try again. It seems there was another error due to missing status parameters. Let me fix that and try again.
It's SK attempting to auto-heal by intercepting a function invocation exception and sending it back to the model, expecting the model to analyze the failure and call the function again with the correct input/arguments. However, the root cause of this issue is not clear. Why did the model not provide arguments for the required parameters (yes, they are nullable but not optional)?
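One way to observe these auto-heal round trips might be an auto function invocation filter that logs each call, its duration, and any exception. This is a minimal sketch, assuming a Semantic Kernel version in which IAutoFunctionInvocationFilter and Kernel.AutoFunctionInvocationFilters are available, and assuming that the failed invocation surfaces as an exception inside next. The class name RetryLoggingFilter and the log format are made up for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;
using Microsoft.SemanticKernel;

// Hypothetical filter: logs every auto-invoked function call and how long it took.
public sealed class RetryLoggingFilter : IAutoFunctionInvocationFilter
{
    private readonly Action<string> _log;

    public RetryLoggingFilter(Action<string> log) => _log = log;

    public async Task OnAutoFunctionInvocationAsync(
        AutoFunctionInvocationContext context,
        Func<AutoFunctionInvocationContext, Task> next)
    {
        var sw = Stopwatch.StartNew();
        try
        {
            // Runs the remaining filters and the function itself.
            await next(context);
        }
        catch (Exception ex)
        {
            // Log the failure and rethrow so that SK's own handling is unchanged.
            _log($"request #{context.RequestSequenceIndex}: {context.Function.Name} failed: {ex.Message}");
            throw;
        }
        finally
        {
            // RequestSequenceIndex starts at 0; a growing value within one
            // user request would point to repeated retries.
            _log($"request #{context.RequestSequenceIndex}: {context.Function.Name} took {sw.ElapsedMilliseconds} ms");
        }
    }
}
```

It could be registered before the chat completion call with kernel.AutoFunctionInvocationFilters.Add(new RetryLoggingFilter(Console.WriteLine)).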
> A [Description("The customer number or Kundennummer. If not specified let the system use the default one.")] string? customerNumber,
I wonder whether the sentence "If not specified, let the system use the default one" misleads the model into deciding on the parameter's optionality based on the description rather than its schema.
As next steps, please add default values to the parameters that should be optional to make them optional and rephrase the parameter descriptions in a way that does not imply optionality for non-optional parameters.
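Applied to a reduced version of the function from the report, the suggestion could look roughly like the sketch below. InvoiceSearchPlugin, SearchInvoices, and the placeholder body are assumptions for illustration; only intent stays required, and the descriptions no longer mention what happens when a value is missing.

```csharp
#nullable enable
using System;
using System.ComponentModel;
using Microsoft.SemanticKernel;

// Hypothetical, shortened variant of the plugin from the report.
public sealed class InvoiceSearchPlugin
{
    [KernelFunction]
    [Description("Performs the search operation related to invoices.")]
    public string SearchInvoices(
        [Description("The user's ask or intent")] string intent,
        // Default values make the following parameters optional, not just nullable.
        [Description("The customer number or Kundennummer.")] string? customerNumber = null,
        [Description("Invoices created after the given time.")] DateTime? fromTime = null,
        [Description("The document/invoice number of the invoice.")] string? documentNo = null)
    {
        // Placeholder body: a real implementation would query the invoice store.
        return $"intent={intent}; customer={customerNumber ?? "<default>"}; " +
               $"from={fromTime?.ToString("O") ?? "<none>"}; documentNo={documentNo ?? "<any>"}";
    }
}
```

The fallback to a default customer is handled inside the method here, so the model does not have to be told about it in the description.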
Additionally, I suggest keeping descriptions only for the parameters that the model cannot reason about without them, and removing the rest to reduce the number of input tokens.
Please also consider minimizing the number of parameters if possible. More recommendations can be found at: https://learn.microsoft.com/en-us/semantic-kernel/concepts/plugins/?pivots=programming-language-csharp#general-recommendations-for-authoring-plugins.
Thanks for your thoughts @SergeyMenshykh . I guess the following might be helpful:
This is the notation we currently use; it makes the model try to set the parameter over and over again:
F1(User required, User? nullable, User? thisMakesModelGtCrazzy)
This might work. I will try:
F1(User required, User? nullable, User? optional = null)
The working solution that I have now tells the model to use a predefined default value:
[Description("The customer number or Kundennummer. If not specified use '''.")] string?
F1(User required, User? nullable, User? optional = null)
The following does NOT work:
[Description("The customer number or Kundennummer. If not specified use null value'.")] string?
F1(User required, User? nullable, User? optional = null)
Regarding minimizing the number of arguments: yes, that would be helpful, and it is exactly what we are currently doing. However, requirements are often shared across multiple functions and their arguments. Reducing the number of arguments could lead to an increase in the number of functions or even plugins, which should also be minimized.
The biggest problem in this case is the model's non-determinism: I never know how it will behave. Moreover, if we switch to another version of the model or to a different model altogether, its behavior will change. This is the challenge we all need to tackle, somehow.
> This might work. I will try: F1(User required, User? nullable, User? optional = null)
Hi @ddobric, did you have a chance to try it? If so, did it work?
Closing for now. @ddobric, feel free to reopen it or create a new one if the suggestion above does not work.