
[Bug]: Success callback not called for async streaming

Open boosh opened this issue 1 year ago • 1 comment

What happened?

The success callback is never called. All of this code is taken from the docs:

import litellm
from litellm import acompletion
import asyncio
import traceback

# track_cost_callback
def track_cost_callback(
    kwargs,                 # kwargs to completion
    completion_response,    # response from completion
    start_time, end_time    # start/end time
):
    try:
        response_cost = kwargs.get("response_cost", 0)
        print("streaming response_cost", response_cost)
    except Exception:
        pass
# set callback
litellm.success_callback = [track_cost_callback] # set custom callback function

async def completion_call():
    try:
        print("test acompletion + streaming")
        response = await acompletion(
            model="claude-3-haiku-20240307",
            messages=[{"content": "Hello, how are you?", "role": "user"}],
            stream=True
        )
        print(f"response: {response}")
        async for chunk in response:
            print(chunk)
    except Exception:
        print(f"error occurred: {traceback.format_exc()}")

asyncio.run(completion_call())
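
For comparison, LiteLLM also supports class-based handlers registered via litellm.callbacks, which expose an explicit async success hook (CustomLogger with async_log_success_event). The following is a minimal sketch based on the custom callback docs; whether this hook actually fires for async streaming on the affected version is unverified:

import asyncio
import litellm
from litellm import acompletion
from litellm.integrations.custom_logger import CustomLogger

class CostHandler(CustomLogger):
    # sync hook, used for synchronous completion calls
    def log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("sync success, response_cost:", kwargs.get("response_cost", 0))

    # async hook, expected to fire for acompletion (including streaming)
    async def async_log_success_event(self, kwargs, response_obj, start_time, end_time):
        print("async success, response_cost:", kwargs.get("response_cost", 0))

litellm.callbacks = [CostHandler()]

async def main():
    response = await acompletion(
        model="claude-3-haiku-20240307",
        messages=[{"content": "Hello, how are you?", "role": "user"}],
        stream=True,
    )
    async for _ in response:
        pass  # success hooks are dispatched after the stream is fully consumed

asyncio.run(main())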

Relevant log output

(venv) MacBook-Pro-6 $ python ./test.py 
test acompletion + streaming
response: <litellm.utils.CustomStreamWrapper object at 0x110ebe010>
ModelResponse(id='chatcmpl-e3412047-eaa1-467a-9adf-efa5d1e1ad5a', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='Hello', role='assistant', function_call=None, tool_calls=None), logprobs=None)], created=1734378237, model='claude-3-haiku-20240307', object='chat.completion.chunk', system_fingerprint=None)
...
ModelResponse(id='chatcmpl-e3412047-e1a1-467a-9adf-efa5d3e1ad5a', choices=[StreamingChoices(finish_reason=None, index=0, delta=Delta(content='?', role=None, function_call=None, tool_calls=None), logprobs=None)], created=1734378238, model='claude-3-haiku-20240307', object='chat.completion.chunk', system_fingerprint=None)
ModelResponse(id='chatcmpl-e1412047-eaa1-467a-9adf-efa5d3e1ad5a', choices=[StreamingChoices(finish_reason='stop', index=0, delta=Delta(content=None, role=None, function_call=None, tool_calls=None), logprobs=None)], created=1734378238, model='claude-3-haiku-20240307', object='chat.completion.chunk', system_fingerprint=None)
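
One way to narrow this down (my addition, not part of the original report) is to run the same callback without streaming; if track_cost_callback fires with stream=False but not with stream=True, the regression is isolated to the async streaming path:

import asyncio
import litellm
from litellm import acompletion

def track_cost_callback(kwargs, completion_response, start_time, end_time):
    print("response_cost:", kwargs.get("response_cost", 0))

litellm.success_callback = [track_cost_callback]

async def non_streaming_call():
    response = await acompletion(
        model="claude-3-haiku-20240307",
        messages=[{"content": "Hello, how are you?", "role": "user"}],
    )
    print(response.choices[0].message.content)
    await asyncio.sleep(1)  # callbacks run in the background; give them time to flush

asyncio.run(non_streaming_call())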

Are you an ML Ops Team?

No

What LiteLLM version are you on?

v1.55.3

Twitter / LinkedIn details

No response

boosh · Dec 16 '24 19:12

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

github-actions[bot] · Mar 18 '25 00:03

boosh · Mar 18 '25 08:03