
Fix multiple openai clients

Open rawsh opened this issue 4 months ago • 3 comments

Since the OpenAI module sets parameters on the base openai import, it's not possible to use multiple different models behind OpenAI-compatible APIs (e.g. both a local Ollama model server and GPT-4).

Manual testing:

```python
import dspy

# API_KEY is a placeholder for your Fireworks API key.
llama_client = dspy.OpenAI(
    api_base="https://api.fireworks.ai/inference/v1/",
    model="accounts/fireworks/models/llama-v2-70b-chat",
    model_type="chat",
    api_key=API_KEY,
)
mixtral_client = dspy.OpenAI(
    api_base="https://api.fireworks.ai/inference/v1/",
    model="accounts/fireworks/models/mixtral-8x7b-instruct",
    model_type="chat",
    api_key=API_KEY,
)
print(llama_client.basic_request("hello")["model"])
print(mixtral_client.basic_request("hello")["model"])
```

Output:

```
HTTP Request: POST https://api.fireworks.ai/inference/v1/chat/completions "HTTP/1.1 200 OK"
accounts/fireworks/models/llama-v2-70b-chat
HTTP Request: POST https://api.fireworks.ai/inference/v1/chat/completions "HTTP/1.1 200 OK"
accounts/fireworks/models/mixtral-8x7b-instruct
```

Cached output (no request is sent):

```python
print(llama_client.basic_request("hello")["model"])
print(mixtral_client.basic_request("hello")["model"])
```

```
accounts/fireworks/models/llama-v2-70b-chat
accounts/fireworks/models/mixtral-8x7b-instruct
```
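For context, here is a minimal self-contained sketch (not DSPy code; the class and function names are illustrative) of why setting parameters on a shared module breaks multiple clients: the second client's configuration silently overwrites the first's.

```python
class FakeOpenAIModule:
    """Stand-in for a pre-1.0-style openai package with module-level settings."""
    api_base = "https://api.openai.com/v1"
    api_key = None

def make_client(api_base, api_key):
    # Mimics a wrapper that configures the shared module instead of an instance.
    FakeOpenAIModule.api_base = api_base
    FakeOpenAIModule.api_key = api_key
    return FakeOpenAIModule  # every "client" is the same shared object

a = make_client("https://api.fireworks.ai/inference/v1/", "key-a")
b = make_client("http://localhost:11434/v1/", "key-b")
assert a is b  # both names point at the shared module
assert a.api_base == "http://localhost:11434/v1/"  # first config was clobbered
```

Per-instance client objects avoid this because each instance carries its own base URL and key.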

rawsh avatar Mar 31 '24 01:03 rawsh

Hi @rawsh , thanks for the contributions! I believe this may break the cache but tagging @okhat for reference here.

Would it be possible to make this change without touching the cache functions?

arnavsinghvi11 avatar Apr 08 '24 18:04 arnavsinghvi11

@arnavsinghvi11 I think without the cache modifications, caching for different models would be broken (e.g. the same prompt on gpt3.5 could pull in cached responses from gpt4). Would gladly take any suggestions on how to improve it though. Is the main problem that existing caches would break?
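The collision being described can be sketched with a toy cache (illustrative names, not DSPy's actual cache): if the model is part of the cache key, the same prompt sent to two models produces two distinct entries instead of one model's response shadowing the other's.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def cached_request(model: str, prompt: str) -> str:
    # Stand-in for the real network call; records which model answered.
    return f"{model} response to {prompt!r}"

r1 = cached_request("gpt-3.5-turbo", "hello")
r2 = cached_request("gpt-4", "hello")
assert r1 != r2  # same prompt, but separate cache entries per model
```

If the key were the prompt alone, the second call would return the first model's cached response, which is the cross-model pollution the comment above describes.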

rawsh avatar Apr 12 '24 00:04 rawsh

Yeah, ideally we'd add this support without modifying the existing behavior, so it won't break existing caches. Curious whether you could add the client creation behind a separate flag while keeping the rest of the cache function behavior the same. dspy.Databricks might help in this direction since it's a wrapper around the GPT3 class.
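A rough sketch of the wrapper idea (all class and attribute names here are illustrative, not DSPy's actual implementation): a subclass keeps its connection settings on the instance rather than on a shared module, so the base class's cache behavior can stay untouched.

```python
class GPT3:
    """Stand-in for the existing base class; its caching is left as-is."""
    def __init__(self, model: str, **kwargs):
        self.model = model
        self.kwargs = kwargs

class PerClientOpenAI(GPT3):
    """Wrapper that scopes api_base/api_key to the instance."""
    def __init__(self, model: str, api_base: str, api_key: str, **kwargs):
        super().__init__(model, **kwargs)
        self.api_base = api_base  # instance-level, not module-level
        self.api_key = api_key

llama = PerClientOpenAI("llama-v2-70b-chat",
                        "https://api.fireworks.ai/inference/v1/", "key")
mixtral = PerClientOpenAI("mixtral-8x7b-instruct",
                          "https://api.fireworks.ai/inference/v1/", "key")
assert llama.model != mixtral.model  # each instance keeps its own config
```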

arnavsinghvi11 avatar Apr 13 '24 01:04 arnavsinghvi11