
chore(weave): LiteLLM Support + Stream-by-flag

tssweeney opened this pull request on May 07 '24 · 4 comments

We need to support a common vendor pattern: streaming is toggled by a flag, not exposed as a different symbol. This PR adds support for this case, as well as support and tests for LiteLLM. Specifically (illustrative sketches follow the list):

  1. Extends the Op on_output handler to also receive the call's inputs, to handle more advanced cases
  2. Extends the accumulator pattern to dynamically decide whether accumulation is appropriate and to support a post-processor
  3. Implements the LiteLLM integration, plus unit tests for it
  4. (Unrelated, but helpful) Redacts api_key from inputs by default
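
For context, the vendor pattern in question looks like this in LiteLLM: the same completion symbol either returns a complete response or an iterator of chunks, depending on the stream flag. A minimal usage sketch (the project name is a placeholder, and it assumes this PR's integration traces litellm.completion once weave.init is called):

import weave
import litellm

# Placeholder project name; per this PR, initializing weave enables
# tracing of litellm.completion calls.
weave.init("litellm-demo")

messages = [{"role": "user", "content": "Hello!"}]

# Non-streaming: completion() returns a full response object.
resp = litellm.completion(model="gpt-3.5-turbo", messages=messages)
print(resp.choices[0].message.content)

# Streaming is the *same symbol* with a flag, so the integration must
# detect stream=True at call time and switch to the accumulator path.
for chunk in litellm.completion(model="gpt-3.5-turbo", messages=messages, stream=True):
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")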
[Screenshot: 2024-05-07 at 15:40:56]
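
Items 1 and 2 above reduce to a shape like the following hypothetical sketch (names here are illustrative, not weave's actual internals): the wrapper decides at call time whether accumulation applies, folds chunks into a single value if so, and hands the final output, together with the original inputs, to the output handler.

def wrap_flag_streaming(fn, accumulate, on_output):
    # Hypothetical sketch of the extended accumulator pattern:
    # `on_output` receives the call's inputs as well (item 1), and
    # accumulation is only engaged when the flag says the output is
    # actually a stream (item 2).
    def wrapped(*args, **kwargs):
        out = fn(*args, **kwargs)
        if not kwargs.get("stream"):
            on_output(out, inputs=kwargs)  # output is already complete
            return out
        def relay():
            state = None
            for chunk in out:
                state = accumulate(state, chunk)  # fold chunks into one value
                yield chunk                       # caller still sees the raw stream
            on_output(state, inputs=kwargs)       # post-process the final value
        return relay()
    return wrapped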

tssweeney · May 07 '24 15:05

Thanks for integrating LiteLLM here @tssweeney. Curious: do you also use the proxy internally?

krrishdholakia · May 07 '24 15:05

> Thanks for integrating LiteLLM here @tssweeney. Curious: do you also use the proxy internally?

Hey @krrishdholakia! We do not use the LiteLLM proxy internally, but we'd be interested in learning more about its capabilities. I'm still working on this PR, but soon I'll have some screenshots to show the results of the integration.

tssweeney · May 07 '24 21:05

Hey @tssweeney, here's the quick start: https://docs.litellm.ai/docs/proxy/quick_start

The main use case is load balancing + spend tracking across projects.

1. Set up the config

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6      # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-large
      api_base: https://openai-france-1234.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 1440

2. Start the proxy

$ litellm --config /path/to/config.yaml

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
      "model": "azure/gpt-turbo-small-ca",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'
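
Since the proxy exposes an OpenAI-compatible endpoint, the same request can also be made from Python with the OpenAI SDK pointed at the proxy. A sketch (the api_key value here is a dummy, since the real Azure keys live in config.yaml):

from openai import OpenAI

# Point the standard OpenAI client at the local LiteLLM proxy.
client = OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")

resp = client.chat.completions.create(
    # Using the model_name alias from config.yaml lets the proxy
    # load-balance across the configured Azure deployments.
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(resp.choices[0].message.content)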

krrishdholakia · May 07 '24 21:05