Tao Chen
Yes, the connector does not have an endpoint argument, but you can provide a custom client. I will link the PR to this issue.
I think HuggingFace will be the most controversial one. It's neither a model provider nor a service provider. You can use its inference endpoints, but for production people may use...
I removed some of the most controversial values. The remaining ones are known to have client SDKs.
`time-to-first-token` and `time-to-next-token` could be hard for some SDKs to capture, since a single chunk returned by some APIs may contain multiple tokens. Would `time-to-first-response` make more sense? Another option...
Regarding our offline conversation on the prompt template: is using a prompt template to parse the chat history into some format overkill? A prompt template can do much more that...
Thanks for the contribution! Will approve once the unit tests pass.
I am able to reproduce this issue on azure-ai-agents==1.1.0, but not on azure-ai-agents==1.2.0b1 or azure-ai-agents==1.1.0b4. It may be worth reaching out to the Azure AI SDK team. A temporary solution for us...
Hi @sushaanttb, Thank you for posting the issue! Are you looking for the token usage of just the final response, or the total token usage of the entire run of...
Hi @sushaanttb, In Python, the result of the group chat orchestration is produced by the manager's `filter_results` method: https://learn.microsoft.com/en-us/semantic-kernel/frameworks/agent/agent-orchestration/group-chat?pivots=programming-language-python#customize-the-group-chat-manager. The return value of the `filter_results` method is a [`MessageResult`](https://github.com/microsoft/semantic-kernel/blob/main/python/semantic_kernel/agents/orchestration/group_chat.py#L90). If...
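To illustrate the pattern described above, here is a minimal sketch of a manager overriding `filter_results`. Note this uses simplified stand-in classes (`ChatMessageContent`, `MessageResult`, `GroupChatManager`) rather than the real ones from `semantic_kernel.agents.orchestration.group_chat`, so only the shape of the override is shown; the `SummaryGroupChatManager` name and its concatenation logic are hypothetical.

```python
# Sketch only: stand-in classes mimicking the shape of semantic-kernel's
# group-chat orchestration, where the manager's filter_results method
# produces the final result of the orchestration as a MessageResult.
import asyncio
from dataclasses import dataclass


@dataclass
class ChatMessageContent:
    # Stand-in for semantic_kernel's ChatMessageContent.
    role: str
    content: str


@dataclass
class MessageResult:
    # Stand-in for the value filter_results returns: the chosen
    # message plus the reason it was chosen.
    result: ChatMessageContent
    reason: str


class GroupChatManager:
    # Stand-in base class; the real manager has more responsibilities.
    async def filter_results(self, chat_history: list[ChatMessageContent]) -> MessageResult:
        # Illustrative default: return the last message in the chat.
        return MessageResult(result=chat_history[-1], reason="Last message in the chat.")


class SummaryGroupChatManager(GroupChatManager):
    """Hypothetical custom manager: concatenates all assistant messages."""

    async def filter_results(self, chat_history: list[ChatMessageContent]) -> MessageResult:
        summary = " ".join(m.content for m in chat_history if m.role == "assistant")
        return MessageResult(
            result=ChatMessageContent(role="assistant", content=summary),
            reason="Concatenated all assistant messages.",
        )


history = [
    ChatMessageContent("user", "Plan a trip."),
    ChatMessageContent("assistant", "Day 1: museum."),
    ChatMessageContent("assistant", "Day 2: hike."),
]
final = asyncio.run(SummaryGroupChatManager().filter_results(history))
print(final.result.content)  # → Day 1: museum. Day 2: hike.
```

In the real framework you would subclass the group chat manager and override `filter_results` the same way; the return value's `result` field is what the orchestration surfaces as its output.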
> [@TaoChenOSU](https://github.com/TaoChenOSU) - This is flagged for both Python and .NET analysis. Will look at the .Net side too