
Add in Bedrock Mistral Streaming fix for litellm proxy

Open sean-bailey opened this issue 11 months ago • 5 comments

With v1.30.3, litellm's internal response operations supported streaming, but OpenAI-compatible streaming calls made through the litellm proxy returned empty responses. On closer inspection, the text Bedrock sends back for the Mistral models lives in chunk_data['outputs'][0]['text']. This PR adds a condition to the Bedrock stream handling in utils for Mistral-formatted streaming.
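Roughly, the added branch looks something like the sketch below; the function name and surrounding structure are simplified for illustration and are not the exact diff, but the chunk_data['outputs'][0]['text'] access is the key detail:

def parse_bedrock_mistral_stream_chunk(chunk_data: dict) -> dict:
    # Bedrock Mistral stream chunks carry the generated text at outputs[0].text
    text = ""
    finish_reason = None
    outputs = chunk_data.get("outputs", [])
    if outputs:
        text = outputs[0].get("text", "")
        finish_reason = outputs[0].get("stop_reason")
    return {
        "text": text,
        "is_finished": finish_reason is not None,
        "finish_reason": finish_reason,
    }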

sean-bailey avatar Mar 08 '24 18:03 sean-bailey

The latest updates on your projects. Learn more about Vercel for Git

Name | Status | Updated (UTC)
litellm | ✅ Ready | Mar 11, 2024 3:34pm
litellm-dashboard | ✅ Ready | Mar 11, 2024 3:34pm

vercel[bot] avatar Mar 08 '24 18:03 vercel[bot]

@sean-bailey could you add a test for this here - https://github.com/BerriAI/litellm/blob/713f5991b8528a311b878886a2c455e68d639077/litellm/tests/test_bedrock_completion.py#L4

Bonus if you can attach a screenshot of it working for you.


Side note: DM'ed you on LinkedIn to learn how you're using the proxy!

Would love to chat if you have ~10 mins this/next week? https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat

krrishdholakia avatar Mar 08 '24 21:03 krrishdholakia

I don't see any other streaming tests in that file, but I can share the OpenAI-compatible code I used to get streaming working with the proxy.

from openai import OpenAI

endpoint_url = "http://localhost:8000/v1"  # local litellm proxy
api_key = "gsdfgsdfg"  # placeholder key for the local proxy
system_prompt = "You are a helpful assistant."
user_prompt = "What is the capital of France?"

client = OpenAI(api_key=api_key, base_url=endpoint_url)

stream = client.chat.completions.create(
    model="mixtral-8x7b-instruct-v0:1",
    messages=[
        {"content": system_prompt, "role": "system"},
        {"content": user_prompt, "role": "user"},
    ],
    stream=True,
)

for chunk in stream:
    # the final chunk's delta has no content, so guard against None
    print(chunk.choices[0].delta.content or "", end="")

You should see streamed output similar to:

The capital of France is Paris. Paris is a major European city and a global center for art, fashion, gastronomy, and culture. It is located along the Seine River, in the north of France. The city is divided into 20 arrondissements, or districts, and is well known for its beautiful architecture, museums, and landmarks such as the Eiffel Tower, the Louvre Museum, the Notre-Dame Cathedral, and the Palace of Versailles. Paris is also home to many prestigious universities and research institutions, making it a hub for education and innovation.

Streaming would look better on video, but this code is repeatable to test with, unless you'd like it inside the test file.
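If a test in that file is preferred, something along these lines could work; the test name, max_tokens value, and the chunk.choices[0].delta.content access are my assumptions based on litellm's usual streaming interface, and it needs AWS credentials for Bedrock:

import litellm

def test_completion_bedrock_mistral_stream():
    response = litellm.completion(
        model="bedrock/mistral.mixtral-8x7b-instruct-v0:1",
        messages=[{"role": "user", "content": "What is the capital of France?"}],
        stream=True,
        max_tokens=100,
    )
    collected = ""
    for chunk in response:
        # delta content can be None on the final chunk
        collected += chunk.choices[0].delta.content or ""
    # the bug showed up as empty streamed responses, so assert we got text back
    assert len(collected) > 0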

The config file I used for running this locally was pretty straightforward:

model_list:
  - model_name: mixtral-8x7b-instruct-v0:1
    litellm_params: 
      model: "bedrock/mistral.mixtral-8x7b-instruct-v0:1"
      aws_region_name: "us-west-2"

litellm_settings: # module level litellm settings - https://github.com/BerriAI/litellm/blob/main/litellm/__init__.py
  drop_params: True
  set_verbose: True

Setup was a pip install litellm[proxy], and I ran the proxy with litellm --config config.yaml.
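As an extra sanity check against the original symptom (empty streamed responses from the proxy), the raw stream can also be inspected without the OpenAI SDK. This is a sketch that assumes the proxy above is running on localhost:8000 and emits OpenAI-style "data: ..." / "[DONE]" server-sent-event framing:

import json
import requests

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    headers={"Authorization": "Bearer gsdfgsdfg"},  # placeholder key
    json={
        "model": "mixtral-8x7b-instruct-v0:1",
        "messages": [{"role": "user", "content": "What is the capital of France?"}],
        "stream": True,
    },
    stream=True,
)
for line in resp.iter_lines():
    if not line or not line.startswith(b"data: "):
        continue
    payload = line[len(b"data: "):]
    if payload == b"[DONE]":
        break
    delta = json.loads(payload)["choices"][0]["delta"]
    # each chunk's delta should carry a non-empty piece of the answer
    print(delta.get("content") or "", end="")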

sean-bailey avatar Mar 08 '24 22:03 sean-bailey

Related: #2464

GlavitsBalazs avatar Mar 14 '24 16:03 GlavitsBalazs

Will review this and either merge the PR or push a fix for the issue this week @sean-bailey @GlavitsBalazs

krrishdholakia avatar Mar 19 '24 01:03 krrishdholakia

Hey @sean-bailey @GlavitsBalazs this should be fixed in v1.34.13

Let me know if this error persists for y'all

krrishdholakia avatar Mar 29 '24 22:03 krrishdholakia