
500 internal error when calling Azure OpenAI GPT4 with dspy

Open dkimmunichre opened this issue 11 months ago • 20 comments

Hello all!

Currently experiencing 500 internal errors when using DSPy with Azure OpenAI GPT-4.

I have been using the Azure OpenAI client to ping a GPT-4 deployment, and am consistently getting: Internal server error 500. I've also pinged the same deployment using litellm for a manual prompting pipeline I've set up alongside it, and have not seen any issues or errors returned, which indicates it's not an Azure-side issue, nor a rate-limit issue, since I ran them back to back but not together.

I've checked that both DSPy's Azure OpenAI client and my litellm pipeline correctly send messages with the keys 'role' and 'content'. I can't seem to find the issue.
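For concreteness, this is the message shape I verified on both sides (a minimal sketch; the content strings are placeholders, not my actual prompts):

```python
# Minimal sketch of the chat-completions message shape being checked:
# every message dict must carry both a "role" and a "content" key.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Answer the question from the context."},
]

# The check both clients pass: no message is missing a required key.
assert all({"role", "content"} <= m.keys() for m in messages)
```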

The signature and module setup I have is as follows:

class QuestionAnswerSignature(dspy.Signature):
    context: str = dspy.InputField()
    question: str = dspy.InputField()
    choices: list[str] = dspy.InputField(format=list)
    answer: QA_output = dspy.OutputField(desc="key-value pairs")

class QuestionAnswer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.pick_answer = dspy.Predict(QuestionAnswerSignature)

    def forward(self, context, question, choices):
        output = self.pick_answer(
            context=context,
            question=question,
            choices=choices,
        )

        # suggest that the answer should stay short (under 100 characters)
        dspy.Assert(len(output.answer) < 100, "Keep the answer under 100 characters.")

        return output

And the actual instantiation of the client I have is as shown below.

gpt_dspy = AzureOpenAI(
    model=model_name,
    api_base=gpt_client.api_base,
    api_version=gpt_client.api_version,
    api_key=gpt_client._get_token(),
    model_type='chat',
)
dspy.settings.configure(lm=gpt_dspy)

The request is sent to the correct address, as evidenced by the 500 response; the logs also confirm the right URL is being used.

The only other thing I can think of is that either certain keyword arguments are missing (which I would assume would give a 400...) or that the prompt text generated by DSPy somehow throws an error on the GPT side.

Is there currently known behavior of Azure OpenAI GPT-4 not working with DSPy?

dkimmunichre avatar Mar 20 '24 17:03 dkimmunichre

You also need to provide deployment_id="your deployment name" in the instantiation of the client
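For example, reusing the instantiation from the original post (a sketch; the deployment name is a placeholder for your own Azure deployment):

```python
gpt_dspy = AzureOpenAI(
    model=model_name,
    api_base=gpt_client.api_base,
    api_version=gpt_client.api_version,
    api_key=gpt_client._get_token(),
    model_type='chat',
    deployment_id="your-deployment-name",  # placeholder: your Azure deployment name
)
```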

excubo-jg avatar Mar 26 '24 22:03 excubo-jg

@excubo-jg hey, thanks for the response. I believe the 'model' parameter does that for me, mostly because deployment_id is only used if 'model' is absent from the inputs or if I'm using an openai version below 1?

dkimmunichre avatar Mar 27 '24 00:03 dkimmunichre

Regardless I'll give it a shot and let you know!

dkimmunichre avatar Mar 27 '24 00:03 dkimmunichre

Should we close this?

okhat avatar Apr 03 '24 17:04 okhat

@okhat Hi Omar. This issue is making me consider not adopting DSPy for some of my experiments... I'm not sure whether you believe I should open an issue on the Azure/OpenAI side, but currently all my other setups with Azure OpenAI have been working (with litellm), and I just wanted to flag that the DSPy Azure OpenAI object seems incompatible with Azure. Do you suggest we close this?

dkimmunichre avatar Apr 03 '24 18:04 dkimmunichre

Well, I am running DSPy with Azure OpenAI...

excubo-jg avatar Apr 03 '24 21:04 excubo-jg

@excubo-jg could you share the version of openai you're using, as well as which parameters you're passing? Also, which model type are you using?

dkimmunichre avatar Apr 03 '24 21:04 dkimmunichre

@excubo-jg I also looked into the code and deployment_id is removed if "model" parameter is provided, which it is in my case. I'm curious as to how you're getting it to work.

dkimmunichre avatar Apr 03 '24 21:04 dkimmunichre

> I'm curious as to how you're getting it to work.

https://github.com/stanfordnlp/dspy/issues/686#issuecomment-2021561999

excubo-jg avatar Apr 03 '24 22:04 excubo-jg

@excubo-jg I've answered that point: the 'model' parameter works as a viable replacement for deployment_id (in fact, deployment_id gets deleted from kwargs in the dspy.AzureOpenAI module when 'model' exists). What other parameters do you use, and which versions of the openai library are you using?

dkimmunichre avatar Apr 04 '24 14:04 dkimmunichre

Hi @dkimmunichre, I am currently testing DSPy with an Azure setup. After initial problems, it now works fine for me.

Params:

"api_base"
"api_version"
"api_key"
"model"

Package-versions:

dspy-ai                   2.4.0  
openai                    1.16.1  

franperic avatar Apr 04 '24 16:04 franperic

@franperic hey! Thank you for responding. What were your issues? I'm having no issues pinging... it just errors out with a 500 for some reason. What method are you using for your api_key? AD credentials?

dkimmunichre avatar Apr 04 '24 17:04 dkimmunichre

> @excubo-jg I've answered that point: the 'model' parameter works as a viable replacement for deployment_id (in fact, deployment_id gets deleted from kwargs in the dspy.AzureOpenAI module when 'model' exists). What other parameters do you use, and which versions of the openai library are you using?

We do agree on the fact that I am running DSPy with Azure OpenAI and you are not, don't we?

And if I do not send the deployment_id, I don't have access to Azure OpenAI.

Which is to be expected. If you were to have a look here (you are using model_type = 'chat'): https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#chat-completions

excubo-jg avatar Apr 04 '24 18:04 excubo-jg

> > @excubo-jg I've answered that point: the 'model' parameter works as a viable replacement for deployment_id (in fact, deployment_id gets deleted from kwargs in the dspy.AzureOpenAI module when 'model' exists). What other parameters do you use, and which versions of the openai library are you using?
>
> We do agree on the fact that I am running DSPy with Azure OpenAI and you are not, don't we?
>
> And if I do not send the deployment_id, I don't have access to Azure OpenAI.
>
> Which is to be expected. If you were to have a look here (you are using model_type = 'chat'): https://learn.microsoft.com/en-us/azure/ai-services/openai/reference#chat-completions

I am also attempting the same setup. The logic in this script shows that what happens with your input parameters is determined by whether you're using a legacy openai version, and by which parameters you pass. deployment_id in this case is the same as model, and is in fact replaced by "model" before being passed into the openai library. So no, that's not the fix. I've also tried it, and no change in the error behavior was observed.
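The dispatch I'm describing can be sketched like this (my simplification of the behavior, not the actual DSPy source):

```python
def normalize_azure_kwargs(**kwargs):
    # Simplified sketch of the behavior described above: when "model"
    # is present, "deployment_id" is dropped from the kwargs; when only
    # "deployment_id" is given, it is promoted to "model" before the
    # kwargs reach the openai client.
    if "model" in kwargs:
        kwargs.pop("deployment_id", None)
    elif "deployment_id" in kwargs:
        kwargs["model"] = kwargs.pop("deployment_id")
    return kwargs

# Passing both: deployment_id is discarded, so adding it changes nothing.
print(normalize_azure_kwargs(model="gpt-4", deployment_id="my-deployment"))
# → {'model': 'gpt-4'}
```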

dkimmunichre avatar Apr 04 '24 18:04 dkimmunichre

If "model" or "deployment_id" were handled differently, my error wouldn't be a 500 but a 400, or the request wouldn't return anything at all because the URL would be wrong.

dkimmunichre avatar Apr 04 '24 18:04 dkimmunichre

In my set-up a value for model is sent, and it is different from the deployment_id. openai is 1.16.2, i.e. not legacy. Per your analysis this set-up cannot work, but it does. The only difference I can see between your initial post and my set-up is that I am sending a deployment_id. I do not see a problem with DSPy here.

excubo-jg avatar Apr 04 '24 19:04 excubo-jg

> In my set-up a value for model is sent, and it is different from the deployment_id. openai is 1.16.2, i.e. not legacy. Per your analysis this set-up cannot work, but it does. The only difference I can see between your initial post and my set-up is that I am sending a deployment_id. I do not see a problem with DSPy here.

Let me see if I can replicate that setup. Even if it somehow works, I would say the statement "I do not see a problem with DSPy here" is a bit strong if the "wrong setup" is the correct setup :D

dkimmunichre avatar Apr 04 '24 19:04 dkimmunichre

At the very least there should be an update to the documentation, or a more detailed guide to using the Azure OpenAI object, which I'm willing to write once I figure this out.

dkimmunichre avatar Apr 04 '24 19:04 dkimmunichre

Update: @excubo-jg I retried with deployment_id and it did not work. There must be other differences that allow your setup to work correctly but not mine.

dkimmunichre avatar Apr 08 '24 18:04 dkimmunichre

Update: this was an error that stems from using an azure_ad_token as the api_key. It seems that despite what the documentation says about azure_ad_token being a valid api_key argument, this does not currently work. It's strange that Azure does not return a 400, but either way, this is a bug.
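For anyone else hitting this, the distinction looks roughly like this (a sketch only; the failing line mirrors my original instantiation, and the resource key is a placeholder):

```python
# Fails with a 500 (the bug described above): an AAD bearer token
# passed as the api_key.
# gpt_dspy = AzureOpenAI(..., api_key=gpt_client._get_token(), ...)

# The alternative: the Azure OpenAI resource's own API key (from the
# portal's "Keys and Endpoint" page) passed as the api_key.
gpt_dspy = AzureOpenAI(
    model=model_name,
    api_base=gpt_client.api_base,
    api_version=gpt_client.api_version,
    api_key="<azure-openai-resource-key>",  # placeholder
    model_type='chat',
)
```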

dkimmunichre avatar May 06 '24 14:05 dkimmunichre