Python: Docs could be improved around AzureChatCompletion
Cross-linking to https://github.com/MicrosoftDocs/semantic-kernel-docs/issues/285 - you can close it as a dupe.
For convenience, pasting 285 here:
This page https://learn.microsoft.com/en-us/semantic-kernel/concepts/ai-services/chat-completion/?tabs=csharp-AzureOpenAI%2Cpython-AzureOpenAI%2Cjava-AzureOpenAI&pivots=programming-language-python tells me that I need a deployed model to use chat:

```python
chat_completion_service = AzureChatCompletion(
    deployment_name="my-deployment",
    api_key="my-api-key",
    endpoint="my-api-endpoint",  # Used to point to your service
    service_id="my-service-id",  # Optional; for targeting specific services within Semantic Kernel
)
```
But this notebook https://github.com/microsoft/semantic-kernel/blob/main/python/samples/getting_started/03-prompt-function-inline.ipynb shows use of

```python
ai_model_id="gpt-3.5-turbo",
```

which I don't have deployed in AI Foundry - so I assume something else is going on that isn't covered by the docs.
Hi @gbm, thanks for filing the issue. AzureChatPromptExecutionSettings does contain an ai_model_id attribute. An ai_model_id provided via the PromptExecutionSettings takes precedence over the ai_model_id specified as part of the AI service's creation/constructor. For an AI service like AzureChatCompletion, the ai_model_id is configured from the deployment_name. If, for a particular function or agent invocation, I want to override the class-level settings and use a different model, I can configure the AzureChatPromptExecutionSettings with a new ai_model_id, and it will be used for that particular run, set of runs, or whatever other logic you have.
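To illustrate the precedence rule, here is a simplified sketch in plain Python. This is not the actual Semantic Kernel internals; the classes and the `resolve_model_id` helper are hypothetical stand-ins that only mirror the attribute names described above:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeChatService:
    """Stand-in for an AI service like AzureChatCompletion:
    its ai_model_id is configured from the deployment_name."""
    deployment_name: str

    @property
    def ai_model_id(self) -> str:
        return self.deployment_name


@dataclass
class FakeExecutionSettings:
    """Stand-in for AzureChatPromptExecutionSettings."""
    ai_model_id: Optional[str] = None


def resolve_model_id(service: FakeChatService,
                     settings: FakeExecutionSettings) -> str:
    """An ai_model_id set on the execution settings wins over the
    service-level value; otherwise the service default is used."""
    return settings.ai_model_id or service.ai_model_id


service = FakeChatService(deployment_name="my-gpt4o-deployment")
print(resolve_model_id(service, FakeExecutionSettings()))
# service-level default is used: my-gpt4o-deployment
print(resolve_model_id(service, FakeExecutionSettings(ai_model_id="gpt-4o")))
# per-invocation override is used: gpt-4o
```

The point is that the notebook's `ai_model_id` on the execution settings is a per-invocation override, not a second required deployment.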
I will make sure the docs have this clarity, which hopefully makes it less confusing for the dev/reader.
Hi @moonbox3 - thanks. My expectation from reading the docs - and it's possibly incorrect - is that AzureChatCompletion relies on my own deployed model. But since I don't have gpt-3.5-turbo deployed, this implies (to me) that the notebook is actually consuming an AzureChatCompletion service in a SaaS way. Since the notebooks are just a quick run-through of code demos before digging into the samples in more detail, I don't want to be surprised (or confused) here. It's a cleaner on-ramp if this (own model vs. SaaS) is spelled out.
Hi @gbm,
Thanks for your response.
> My expectation from reading the docs - and it's possibly incorrect - is that AzureChatCompletion relies on my own deployed model. But since I don't have gpt-3.5-turbo deployed, this implies (to me) that the notebook is actually consuming an AzureChatCompletion service in a SaaS way.
There are some callouts about the deployment used for AzureChatCompletion, but yes, this could be improved: the gpt-3.5-turbo value is a placeholder. The deployment_name needs to point to an actual deployment in your Azure OpenAI resource.
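One way a sample could make the placeholder explicit is to fail fast when the configured deployment name still looks like a sample value. This is a hypothetical helper, not Semantic Kernel's own settings loading; the env var name and the `PLACEHOLDERS` set are illustrative assumptions:

```python
import os

# Illustrative set of sample values that are NOT real deployments.
PLACEHOLDERS = {"my-deployment", "gpt-3.5-turbo"}


def get_deployment_name(env_var: str = "AZURE_OPENAI_CHAT_DEPLOYMENT_NAME") -> str:
    """Return the configured deployment name, rejecting known placeholders.

    The deployment name must match an actual deployment created in your
    Azure OpenAI resource; the sample values in docs/notebooks will not work.
    """
    name = os.environ.get(env_var, "")
    if not name or name in PLACEHOLDERS:
        raise ValueError(
            f"Set {env_var} to an actual deployment name from your "
            "Azure OpenAI resource; the sample values are placeholders."
        )
    return name
```

A guard like this would surface the "placeholder vs. real deployment" distinction at startup instead of at the first failed API call.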
Hi @moonbox3 - thanks. Yes, it would be great if you could:
- Spell out in the docs that chat is backed by our own deployed model and that there is no fall-through to an OpenAI-hosted model. It would be nice to have this super clear for an enterprise.
- Add a comment to the Jupyter notebook making clear that the execution settings take precedence and that the argument passed there is just a placeholder. Thanks!
@moonbox3 Some more thoughts on this, if that's OK. The value here (for my use case) is chat (AI) services within AI Foundry - and within the Microsoft ecosystem. That is also a driver for use of SK. So it would help to slightly restructure the docs or the Jupyter "getting started" samples to show third-party AI services being used, but also a clear differentiation/example of where Azure AI alone is being used, if that makes sense. It's a useful feature. Just top of mind, thought I'd share.
@moonbox3 Just to dump some more thoughts on this issue (sorry!): the Jupyter notebooks link to the Azure OpenAI Service key in the .env setup sections. The only model-choice guidance for chatbots on that page is gpt-35-turbo-instruct (which conflicts slightly with the use of gpt-3.5-turbo in the notebook). I don't have that model deployed, so the results I get back from the chatbot are perhaps slightly different (in a non-deterministic way) from what might be expected. It would be good to give a clearer choice of model in the getting-started material, since the only other guidance I can easily find is on this page: https://learn.microsoft.com/en-us/azure/ai-foundry/foundry-models/concepts/models
This issue is stale because it has been open for 90 days with no activity.