OpenAI v1 Migration
Migrated the OpenAI GPT3 LM client to v1's OpenAI/AzureOpenAI clients. The previous version had kwarg inconsistencies with Azure calls.
Fixes #377
Thank you so much @KCaverly ! I really appreciate your contribution. I left some comments but the overall summary is: can we actually just have an AzureOpenAI client that's separate from the current "GPT3" class? That way there are no breaking changes for non-Azure users.
I agree, we need to separate Azure from the rest; it's safer and also more maintainable
A separate AzureOpenAI class is much cleaner. What I can do is revert the GPT3 changes, pull Azure out so that GPT3 is exclusively for OpenAI, and introduce a new AzureOpenAI LM.
I would say it's better to delegate these LM abstractions to other libraries that also take care of legacy versions, like LangChain or LlamaIndex. This removes the burden of maintenance from DSPy contributors, who should focus on improving DSPy concepts and features, not the "accessories/devices" around them. DSPy should not reinvent these LM abstractions; it should just expose adapters to connect to them.
I split out the Azure functionality into its own class. I agree that it makes more sense to abstract over LMs to some sort of external package. In the meantime, the concerns above should be resolved with these changes.
The caching methods have not been changed, and this should not be breaking at all for existing non-Azure users. The only difference is that the api_provider argument is no longer necessary; however, I have left it in so we can redirect users who may have Azure API calls to the new class.
Overall, LGTM. I made minor comments on the PR.
Two questions about this PR and one general question:

- I am using Azure APIs. If you think it is useful, I'd be happy to test the PR.
- Would you add an Azure OpenAI entry to docs/language_models_client.md as well to help users?
- We have seen more people support OpenAI-compatible APIs, so that the OpenAI client can be used instead of separate client modules. I am posing a question: are there enough differences between the OpenAI and Azure APIs that we need a separate module? I have seen a couple of other software libraries keep separate modules for Azure and OpenAI APIs, but I am still asking.
Thanks @insop for the review.
I have updated the docs and a few of the comments above to ensure consistency. If you have access to AzureOpenAI, I would appreciate you doing a quick test.
On 3: we walked through why we broke Azure OpenAI out above, but at a high level: given the new v1 OpenAI API, I don't think Azure can be integrated into the current OpenAI class without invalidating all existing caches, which is something we definitely do not want to do.
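To illustrate why a signature change can invalidate caches, here is a minimal sketch, assuming (hypothetically) that cached responses are keyed on a hash of the full request kwargs; `cache_key` is an illustrative name, not DSPy's actual cache function:

```python
import hashlib
import json

def cache_key(**kwargs):
    # Sketch: key the cache on the sorted, serialized kwargs, so any
    # added or removed keyword argument produces a different key.
    blob = json.dumps(kwargs, sort_keys=True).encode()
    return hashlib.sha256(blob).hexdigest()

old = cache_key(model="gpt-3.5-turbo", prompt="hi", api_provider="openai")
new = cache_key(model="gpt-3.5-turbo", prompt="hi")
print(old == new)  # False: dropping a kwarg orphans every existing entry
```

Under this scheme, folding Azure-specific kwargs into the shared class (or removing `api_provider`) would change every key, so no previously cached response would ever be found again.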
Hey, just gave this a quick try using the latest OpenAI Azure version. Great work! There are two small issues:

- Had to fix the typo highlighted above (otherwise got an error about `Chat` having no `completion` attribute), but otherwise it went smoothly.
- It's not safe to infer `model_type` based on the model name for Azure, since everyone will have different deployment names. Given that the wide majority of use nowadays is for chat models (few people use 3.5-instruct, and `davinci-003` isn't available on Azure), I think it'd make sense to change:

```python
default_model_type = (
    "chat"
    if ("gpt-3.5" in model or "turbo" in model or "gpt-4" in model)
    and ("instruct" not in model)
    else "text"
)
```

to just having `model_type` assume the default value of `chat`, with a note in the documentation to manually change it to `text` if using an instruct model?
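The suggested change can be sketched like this: the old name-based heuristic versus a plain default (`infer_model_type` is an illustrative name, not the actual DSPy function):

```python
def infer_model_type(model: str) -> str:
    # Old heuristic: guess chat vs. text from the model name.
    # Unreliable on Azure, where deployment names are user-chosen.
    return (
        "chat"
        if ("gpt-3.5" in model or "turbo" in model or "gpt-4" in model)
        and ("instruct" not in model)
        else "text"
    )

# Proposed behaviour: just default to "chat" and let users of
# instruct models pass model_type="text" explicitly.
DEFAULT_MODEL_TYPE = "chat"

print(infer_model_type("my-azure-deployment"))  # "text" -- misclassifies a chat deployment
print(infer_model_type("gpt-4"))                # "chat"
```

The failure mode is clear from the first call: an Azure deployment named `my-azure-deployment` matches none of the substrings and falls through to `"text"`, even if it serves a chat model.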
Thanks @bclavie for giving it a test; I've adjusted both the model_type default and the typo.
@KCaverly Thank you for the update. I have not given it a try, but since @bclavie has done the test, I will skip mine.
@okhat DSPy is used for this course; I think it would be great if we could consider the following before merging this PR:

- Adding a git tag to the repo, so that the version that uses the older OpenAI API can be checked out. That way the course notebook will work in a Colab environment by simply cloning the repo at the tag.
- Identifying a specific DSPy version that works with the older OpenAI API so that it can be used for pip install, so that the course notebook and the grader still work.
Lots of cool stuff here - thank you all. Just to be clear @insop old openai is still fully supported!
See intro.ipynb — it installs openai 0.28.1 and it works with very old caches
The main test I need before merging is to ensure that existing usage of openai is completely unaffected. My only worry is that the Azure file uses all the same function names and is imported with `*`, which could pollute the namespace with the wrong functions; the cache functions are named the same.
That's a great catch @okhat. I've updated the imports to pull in just AzureOpenAI, so none of the cache functions should be available except by importing directly from the file.
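For anyone curious why the wildcard import was risky, here is a self-contained sketch (the module and helper names are made up) showing how `from module import *` silently shadows a same-named function in the importing file:

```python
import sys
import types

# Fabricate a stand-in for the Azure client module; it defines a cache
# helper whose name collides with one in "our" file.
azure_sketch = types.ModuleType("azure_sketch")
exec("def cached_request(prompt):\n    return 'azure:' + prompt",
     azure_sketch.__dict__)
sys.modules["azure_sketch"] = azure_sketch

def cached_request(prompt):  # this file's own helper
    return "openai:" + prompt

from azure_sketch import *  # noqa: F403 -- silently rebinds cached_request

print(cached_request("hi"))  # now resolves to the Azure version
```

Importing only the names you need (`from azure_sketch import AzureOpenAI`) avoids this, since colliding helpers stay confined to their own module.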
Sounds good, I just want to make sure we avoid any disruptive changes for existing users.
That said, adding a git tag and a description with a version number for non-trivial changes would be good for current users.
Hey all, is there anything outstanding on this being merged?
Hey @KCaverly nothing outstanding, merged :D Thanks a ton!
Thank you for this :D