FastChat
Feature - support stream for `ChatCompletion`
Adds streaming support for `ChatCompletion`, both in the client and in the API.
How to use?
Async
```python
import asyncio

from fastchat import client


async def async_main():
    model_name = "vicuna-7b"
    res = await client.ChatCompletion.acreate(
        model=model_name,
        messages=[{"role": "user", "content": "Tell me a story with more than 1000 words."}],
        temperature=0.0,
        max_tokens=32,
        stream=True,
    )
    async for chunk in res:
        content = chunk.choices[0].delta.content
        if content is not None:
            print(content, end="")


if __name__ == "__main__":
    asyncio.run(async_main())
```
Sync
```python
from fastchat import client


def sync_main():
    model_name = "vicuna-7b"
    res = client.ChatCompletion.create(
        model=model_name,
        messages=[{"role": "user", "content": "Tell me a story with more than 1000 words."}],
        temperature=0.0,
        max_tokens=32,
        stream=True,
    )
    for chunk in res:
        content = chunk.choices[0].delta.content
        if content is not None:
            print(content, end="")


if __name__ == "__main__":
    sync_main()
```
Issue
Fix #569
Hi @baradm100, I don't understand why this PR has to change so many files. I didn't see this implementation when I set out to implement streaming today (shipped in #873), but comparing my work to yours, it appears you are editing many more files. Why?
Hi @kfatehi !
- I ran the `format.sh` file, as recommended
- I added `fastchat/serve/test_stream.py`, per the standard of having a test file for a new flow
- Used `BaseModel` for the data, to standardize and validate the input/output
- Added support for the SDK as well (for `create` and `acreate`)
- Moved all the logic I added that is not needed in the class out to helper files
Regarding the re-formatting: I know it's intimidating, and I can revert the reformat.
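As an illustration of the `BaseModel` point above, this is roughly how a streamed chunk can be validated with pydantic. All class and field names here are hypothetical sketches for illustration, not the PR's actual models:

```python
from typing import List, Optional

from pydantic import BaseModel


class DeltaMessage(BaseModel):
    # Hypothetical model: one incremental piece of the assistant's reply.
    role: Optional[str] = None
    content: Optional[str] = None


class StreamChoice(BaseModel):
    index: int
    delta: DeltaMessage


class ChatCompletionChunk(BaseModel):
    choices: List[StreamChoice]


# pydantic validates and coerces the nested dicts on construction,
# so malformed server output fails loudly instead of silently.
chunk = ChatCompletionChunk(choices=[{"index": 0, "delta": {"content": "Hi"}}])
print(chunk.choices[0].delta.content)  # → Hi
```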
@baradm100 Thanks for implementing this and rebasing. I will do some local testing and style change and then merge this soon.
@baradm100 This PR is merged! I made some style changes. If you have better ideas, please submit subsequent PRs!
Is it tested as compatible with the `openai.ChatCompletion` module? No issue in non-streaming mode, but I can't get messages when streaming...
The stream implementation has some inconsistencies with OpenAI's, and I'm fixing those in PR #818.
> Is it tested as compatible with the `openai.ChatCompletion` module? No issue in non-streaming mode, but I can't get messages when streaming...
I used curl for my test. Do you mind sharing your test code?
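For reference, OpenAI-compatible servers typically stream responses as server-sent events: each line is `data: {json chunk}` and the stream ends with the sentinel `data: [DONE]`. The sketch below parses such lines and reassembles the message text; the chunk schema shown is an assumption based on OpenAI's documented format, not necessarily this PR's exact output:

```python
import json


def parse_sse_chunks(raw_lines):
    """Parse OpenAI-style server-sent-event lines into chunk dicts.

    Lines that don't start with 'data:' (keep-alives, blanks) are
    skipped; the '[DONE]' sentinel terminates the stream.
    """
    chunks = []
    for line in raw_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunks.append(json.loads(payload))
    return chunks


# Sample stream as an OpenAI-compatible server might emit it
# (schema assumed from OpenAI's documented streaming format).
sample = [
    'data: {"choices": [{"index": 0, "delta": {"role": "assistant"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}}]}',
    'data: {"choices": [{"index": 0, "delta": {"content": " world"}}]}',
    "data: [DONE]",
]

text = "".join(
    c["choices"][0]["delta"].get("content", "") for c in parse_sse_chunks(sample)
)
print(text)  # → Hello world
```

If curl shows well-formed `data:` lines but a client shows nothing, the mismatch is usually in how the client splits events or in the chunk schema it expects.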