500 internal error when calling Azure OpenAI GPT-4 with DSPy
Hello all!
Currently experiencing 500 internal errors when using DSPy with Azure OpenAI GPT-4.
I have been using DSPy's Azure OpenAI client to ping a GPT-4 deployment, and am consistently getting: Internal server error 500. I've also pinged the same deployment with litellm, through a manual prompting pipeline I've set up alongside it, and have seen no issues or errors. This suggests it's not an Azure-side issue, nor a rate-limit issue, since I ran the two back to back rather than concurrently.
I've checked that both DSPy's Azure OpenAI client and my litellm pipeline correctly send messages with the keys 'role' and 'content'. I can't seem to find the issue.
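For reference, a minimal sketch of the litellm path that works against the same deployment (the deployment name, endpoint, api_version, and key below are placeholders, not my exact pipeline):

import litellm

# Comparison call via litellm; all values below are placeholders.
response = litellm.completion(
    model="azure/my-gpt4-deployment",  # "azure/<deployment name>"
    messages=[{"role": "user", "content": "ping"}],
    api_base="https://my-resource.openai.azure.com",
    api_version="2024-02-01",
    api_key="<key or AD token>",
)
print(response.choices[0].message.content)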
The signature and module setup I have is as follows:
import dspy

# Assumed placeholder for the custom QA_output type (defined elsewhere);
# the desc below suggests key-value pairs.
QA_output = dict[str, str]

class QuestionAnswerSignature(dspy.Signature):
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    choices: list[str] = dspy.InputField(format=list)
    answer: QA_output = dspy.OutputField(desc="key-value pairs")
class QuestionAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.pick_answer = dspy.Predict(QuestionAnswerSignature)

    def forward(self, context, question, choices):
        output = self.pick_answer(
            context=context,
            question=question,
            choices=choices,
        )
        # suggest that the answer should be short (ideally under ~5 tokens)
        dspy.Assert(len(output.answer) < 100, "Answer should be short.")
        return output
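For context, the module would be invoked roughly like this (values are purely illustrative):

qa = QuestionAnswer()
prediction = qa(
    context="Paris is the capital of France.",
    question="What is the capital of France?",
    choices=["London", "Paris", "Berlin"],
)
print(prediction.answer)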
And the actual instantiation of the client I have is as shown below.
gpt_dspy = AzureOpenAI(
    model=model_name,
    api_base=gpt_client.api_base,
    api_version=gpt_client.api_version,
    api_key=gpt_client._get_token(),
    model_type='chat',
)
dspy.settings.configure(lm=gpt_dspy)
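A direct call to the configured LM is enough to reproduce the error (the prompt is arbitrary):

# Smoke test: in this setup the call below fails with a 500 from Azure.
print(gpt_dspy("Hello, are you there?"))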
The request is reaching the right address: the server itself returns the 500, and the logs confirm the correct URL is being used.
The only other thing I can think of is that either certain keyword arguments are missing (which I would assume would give a 400...) or that the prompt text generated by DSPy somehow triggers an error on the GPT side.
Is there any known issue with Azure OpenAI GPT-4 not working with DSPy?
You also need to provide deployment_id="your deployment name" when instantiating the client.
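Something like this, assuming the same setup as above (the deployment name is a placeholder):

gpt_dspy = AzureOpenAI(
    model=model_name,
    deployment_id="my-gpt4-deployment",  # placeholder: your Azure deployment name
    api_base=gpt_client.api_base,
    api_version=gpt_client.api_version,
    api_key=gpt_client._get_token(),
    model_type='chat',
)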
@excubo-jg hey, thanks for the response. I believe the 'model' parameter covers that for me, since deployment_id is only used if 'model' is absent from the inputs or if I'm on an openai version below 1.0, no?
Regardless I'll give it a shot and let you know!
Should we close this?
@okhat Hi Omar. This issue is making me consider not adopting DSPy for some of my experiments... I'm not sure whether you think I should open an issue on the Azure / OpenAI side, but all my other setups with Azure OpenAI have been working (with litellm), and I just wanted to flag that the DSPy client for Azure OpenAI seems incompatible with Azure. Do you suggest we close this?
Well, I am running DSPy with Azure OpenAI...
@excubo-jg could you share the version of openai you're using, as well as which parameters you're passing? Also, which model type are you using?
@excubo-jg I also looked into the code: deployment_id is removed if the "model" parameter is provided, which it is in my case. I'm curious as to how you're getting it to work.
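The relevant behavior, paraphrased from my reading of the client (not verbatim dspy source):

# Paraphrase of the kwarg handling described above (not verbatim dspy source):
if "model" in kwargs:
    kwargs.pop("deployment_id", None)  # deployment_id is dropped when model is set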
"I'm curious as to how you're getting it to work."
https://github.com/stanfordnlp/dspy/issues/686#issuecomment-2021561999
@excubo-jg I've answered that point: the "model" parameter works as a viable replacement for deployment_id (in fact, deployment_id gets deleted from kwargs in the dspy.AzureOpenAI module when 'model' exists). What other parameters do you use, and what version of the openai library are you using?
Hi @dkimmunichre, I am currently testing DSPy with an Azure setup. After some initial problems it now works fine for me.
Params:
"api_base"
"api_version"
"api_key"
"model"
Package-versions:
dspy-ai 2.4.0
openai 1.16.1
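In other words, a configuration along these lines (values are placeholders):

import dspy

lm = dspy.AzureOpenAI(
    api_base="https://my-resource.openai.azure.com/",  # placeholder
    api_version="2024-02-01",                          # placeholder
    api_key="<api key>",
    model="gpt-4",
)
dspy.settings.configure(lm=lm)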
@franperic hey! Thank you for responding. What were your issues? I have no trouble reaching the endpoint; it just errors out with a 500 for some reason. What method are you using for your api_key? AD credentials?
@excubo-jg I've answered that point: the "model" parameter works as a viable replacement for deployment_id (in fact, deployment_id gets deleted from kwargs in the dspy.AzureOpenAI module when 'model' exists). What other parameters do you use, and what version of the openai library are you using?
We do agree on the fact that I am running DSPy with Azure OpenAI and you are not, don't we?
And if I do not send the deployment_id, I don't have access to Azure OpenAI.
Which is to be expected. If you were to have a look here (you are using model_type = 'chat'): https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#chat-completions
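The chat completions endpoint documented there is addressed per deployment, which is why the deployment name matters:

POST https://{your-resource-name}.openai.azure.com/openai/deployments/{deployment-id}/chat/completions?api-version={api-version}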
I am also attempting the same setup. The logic in this script shows that how your input parameters are handled is determined by whether you're using a legacy openai version and by which parameters you pass. deployment_id in this case is the same as model, and is in fact replaced by "model" before being passed into the openai library. So no, that's not the fix. I've also tried it, and no change in the error behavior was observed.
If "model" or "deployment_id" were handled differently, my error wouldn't be a 500 but a 400, or the request wouldn't return anything at all because the URL would be wrong.
In my setup a value for model is sent, and it is different from the deployment_id. openai is 1.16.2, i.e. not legacy. Per your analysis this setup cannot work, but it does. The only difference I can see between your initial post and my setup is that I am sending a deployment_id. I do not see a problem with DSPy here.
Let me see if I can replicate that setup. Even if it somehow works, I'd say "I do not see a problem with DSPy here" is a bit strong if the "wrong setup" turns out to be the correct setup :D
At the very least there should be an update to the documentation, or a more detailed guide to using the Azure OpenAI object, which I'm willing to write once I figure this out.
Update: @excubo-jg I retried with deployment_id and it did not work. There must be other differences that allow your setup to work correctly while mine does not.
Update: the error stems from using an azure_ad_token as the api key. Despite what the documentation says about azure_ad_token being a valid api_key argument, this does not currently work. It's strange that Azure does not return a 400, but either way, this is a bug.
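For anyone hitting the same thing, a minimal sketch of a workaround: pass the AD token the way the underlying openai SDK expects rather than as api_key (endpoint and api_version are placeholders; this goes through the openai SDK directly to illustrate the distinction):

from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Fetches Entra ID (AAD) tokens on demand; the SDK refreshes them as needed.
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://my-resource.openai.azure.com",  # placeholder
    api_version="2024-02-01",                               # placeholder
    azure_ad_token_provider=token_provider,                 # instead of api_key=<AD token>
)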