
Feature - support stream for `ChatCompletion`

Open baradm100 opened this issue 1 year ago • 2 comments

Adds streaming support for ChatCompletion in both the client and the API.

How to use?

Async

import asyncio

from fastchat import client


async def async_main():
    model_name = "vicuna-7b"

    res = await client.ChatCompletion.acreate(
        model=model_name,
        messages=[{"role": "user", "content": "Tell me a story with more than 1000 words."}],
        temperature=0.0,
        max_tokens=32,
        stream=True,
    )
    async for chunk in res:
        content = chunk.choices[0].delta.content
        if content is not None:
            print(content, end="")

if __name__ == "__main__":
    asyncio.run(async_main())

Sync

from fastchat import client


def sync_main():
    model_name = "vicuna-7b"

    res = client.ChatCompletion.create(
        model=model_name,
        messages=[{"role": "user", "content": "Tell me a story with more than 1000 words."}],
        temperature=0.0,
        max_tokens=32,
        stream=True,
    )
    for chunk in res:
        content = chunk.choices[0].delta.content
        if content is not None:
            print(content, end="")

if __name__ == "__main__":
    sync_main()
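
When the full reply is needed rather than incremental printing, the same loop can accumulate the deltas instead. A minimal sketch, assuming chunks have the `choices[0].delta.content` shape used above; `collect_stream` and the stand-in chunks built with `SimpleNamespace` are hypothetical and exist only so the example runs without a server:

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def collect_stream(chunks: Iterable) -> str:
    """Join the non-empty `delta.content` pieces of a streamed response."""
    parts = []
    for chunk in chunks:
        content = chunk.choices[0].delta.content
        # Some chunks carry no text (content is None), so skip them.
        if content is not None:
            parts.append(content)
    return "".join(parts)


def fake_chunk(content: Optional[str]) -> SimpleNamespace:
    # Hypothetical stand-in for a streamed chunk; not part of FastChat.
    delta = SimpleNamespace(content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])


if __name__ == "__main__":
    demo = [fake_chunk(None), fake_chunk("Once "), fake_chunk("upon a time")]
    print(collect_stream(demo))
```

With a real server, the iterator returned by `client.ChatCompletion.create(..., stream=True)` would be passed in place of `demo`.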

Issue

Fix #569

baradm100 avatar May 05 '23 11:05 baradm100

Hi @baradm100, I don't understand why this PR has to change so many files. I didn't see this implementation when I set out to implement streaming today (shipped in #873), but comparing my work to yours, it appears you are editing way more files. Why?

kfatehi avatar May 06 '23 21:05 kfatehi

Hi @kfatehi !

  1. I ran the format.sh file, as recommended
  2. I added fastchat/serve/test_stream.py, following the convention of having a file to test a new flow
  3. I used BaseModel for the data, to standardize and validate the input/output
  4. I added support for the SDK as well (for create and acreate)
  5. I moved all the logic I added that doesn't belong in the class into helper files

Regarding the reformatting, I know it's intimidating, and I can revert it.

baradm100 avatar May 07 '23 06:05 baradm100

@baradm100 Thanks for implementing this and rebasing. I will do some local testing and style changes and then merge this soon.

merrymercy avatar May 08 '23 11:05 merrymercy

@baradm100 This PR is merged! I made some style changes. If you have better ideas, please submit subsequent PRs!

merrymercy avatar May 08 '23 13:05 merrymercy

Has this been tested for compatibility with the openai.ChatCompletion module? There's no issue in non-streaming mode, but I can't get messages in streaming mode...

XReyRobert avatar May 08 '23 18:05 XReyRobert

The stream implementation has some inconsistencies with OpenAI's, and I'm fixing those in PR #818.

jstzwj avatar May 08 '23 18:05 jstzwj

> Has this been tested for compatibility with the openai.ChatCompletion module?
>
> There's no issue in non-streaming mode, but I can't get messages in streaming mode...

I used curl for my test. Do you mind sharing your test code?

kfatehi avatar May 08 '23 18:05 kfatehi
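
For comparing raw streaming output (for example, what curl prints) against OpenAI's format, a small parser for the response lines can help. This is a sketch that assumes the server emits OpenAI-style server-sent events, i.e. `data: {json}` lines terminated by `data: [DONE]`; whether FastChat's output matched that shape at the time is exactly what the comments above leave open. `iter_sse_content` is a hypothetical helper, and the sample lines are made up for illustration:

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield `delta.content` strings from OpenAI-style `data:` lines."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


if __name__ == "__main__":
    # Made-up sample; a real run would read lines from the HTTP response.
    sample = [
        'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "Hello"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": " world"}}]}',
        "",
        "data: [DONE]",
    ]
    print("".join(iter_sse_content(sample)))
```

If a stream yields nothing through this parser while the non-streaming call works, the mismatch is likely in the line framing or the chunk schema, which narrows down where to look.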