
chore(weave): LiteLLM Support + Stream-by-flag

tssweeney opened this pull request on May 07 '24 · 4 comments

We need to support a common vendor pattern: streaming is toggled by a flag, not exposed as a different symbol. This PR adds support for this case, as well as support and tests for LiteLLM. Specifically (illustrative sketches follow the list):

  1. Extends the Op on_output handler to also receive the call's inputs, to handle more advanced cases
  2. Extends the accumulator pattern to dynamically decide whether accumulation is appropriate and to support a post-processor
  3. Implements the LiteLLM integration, plus unit tests for it
  4. (Unrelated, but helpful) Redacts api_key from inputs by default
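
For context, the vendor pattern in question looks like this in LiteLLM: the same completion symbol either returns a complete response or an iterator of chunks, depending on the stream flag. A minimal usage sketch (the project name is a placeholder, and it assumes this PR's integration traces litellm.completion once weave.init is called):

import weave
import litellm

# Placeholder project name; per this PR, initializing weave enables
# tracing of litellm.completion calls.
weave.init("litellm-demo")

messages = [{"role": "user", "content": "Hello!"}]

# Non-streaming: completion() returns a full response object.
resp = litellm.completion(model="gpt-3.5-turbo", messages=messages)
print(resp.choices[0].message.content)

# Streaming is the *same symbol* with a flag, so the integration must
# detect stream=True at call time and switch to the accumulator path.
for chunk in litellm.completion(model="gpt-3.5-turbo", messages=messages, stream=True):
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")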
[Screenshot: 2024-05-07 at 15:40:56]
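
Items 1 and 2 above reduce to a shape like the following hypothetical sketch (names here are illustrative, not weave's actual internals): the wrapper decides at call time whether accumulation applies, folds chunks into a single value if so, and hands the final output, together with the original inputs, to the output handler.

def wrap_flag_streaming(fn, accumulate, on_output):
    # Hypothetical sketch of the extended accumulator pattern:
    # `on_output` receives the call's inputs as well (item 1), and
    # accumulation is only engaged when the flag says the output is
    # actually a stream (item 2).
    def wrapped(*args, **kwargs):
        out = fn(*args, **kwargs)
        if not kwargs.get("stream"):
            on_output(out, inputs=kwargs)  # output is already complete
            return out
        def relay():
            state = None
            for chunk in out:
                state = accumulate(state, chunk)  # fold chunks into one value
                yield chunk                       # caller still sees the raw stream
            on_output(state, inputs=kwargs)       # post-process the final value
        return relay()
    return wrapped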

tssweeney · May 07 '24 15:05

Thanks for integrating LiteLLM here @tssweeney. Curious: do you also use the proxy internally?

krrishdholakia · May 07 '24 15:05

> Thanks for integrating LiteLLM here @tssweeney. Curious: do you also use the proxy internally?

Hey @krrishdholakia! We do not use the LiteLLM proxy internally, but we'd be interested in learning more about its capabilities. I'm still working on this PR, but soon I'll have some screenshots to show the results of the integration.

tssweeney · May 07 '24 21:05

Hey @tssweeney, here's the quick start: https://docs.litellm.ai/docs/proxy/quick_start

The main use case is load balancing + spend tracking across projects.

1. Set up the config

model_list:
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/<your-deployment-name>
      api_base: <your-azure-endpoint>
      api_key: <your-azure-api-key>
      rpm: 6      # Rate limit for this deployment: in requests per minute (rpm)
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-small-ca
      api_base: https://my-endpoint-canada-berri992.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 6
  - model_name: gpt-3.5-turbo
    litellm_params:
      model: azure/gpt-turbo-large
      api_base: https://openai-france-1234.openai.azure.com/
      api_key: <your-azure-api-key>
      rpm: 1440

2. Start the proxy

$ litellm --config /path/to/config.yaml

3. Test it!

curl --location 'http://0.0.0.0:4000/chat/completions' \
--header 'Content-Type: application/json' \
--data ' {
      "model": "azure/gpt-turbo-small-ca",
      "messages": [
        {
          "role": "user",
          "content": "what llm are you"
        }
      ]
    }
'
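
Since the proxy exposes an OpenAI-compatible endpoint, the same request can also be made from Python with the OpenAI SDK pointed at the proxy. A sketch (the api_key value here is a dummy, since the real Azure keys live in config.yaml):

from openai import OpenAI

# Point the standard OpenAI client at the local LiteLLM proxy.
client = OpenAI(api_key="anything", base_url="http://0.0.0.0:4000")

resp = client.chat.completions.create(
    # Using the model_name alias from config.yaml lets the proxy
    # load-balance across the configured Azure deployments.
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "what llm are you"}],
)
print(resp.choices[0].message.content)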

krrishdholakia · May 07 '24 21:05