Streaming example
Can an example of streaming output as a generator be added? My use case is replacing langchain in production for a QA system.
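For reference, this is the kind of generator being asked for, sketched here with the raw pre-1.0 openai SDK (which guidance wraps) rather than with guidance itself; the stream_answer name is just illustrative:

import openai

def stream_answer(question):
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': question}],
        stream=True,
    )
    for chunk in response:
        # Each chunk carries an incremental delta; 'content' is absent on
        # the role-only first chunk and on the final stop chunk.
        delta = chunk['choices'][0]['delta']
        if 'content' in delta:
            yield delta['content']

for token in stream_answer('What makes water sparkle?'):
    print(token, end='', flush=True)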
As a follow-up, it would be great if the library allowed doing something like the following without blocking the event loop, by internally calling openai.ChatCompletion.acreate instead of create.
import asyncio
import guidance

guidance.llm = guidance.llms.OpenAI(model='gpt-3.5-turbo')

program = guidance("""
{{#user~}}
Write a poem about sparkling water
{{~/user}}
{{#assistant~}}
{{gen 'poem'}}
{{~/assistant}}
""")

async def counter():
    for x in range(5):
        print(x, flush=True)
        await asyncio.sleep(1)

async def generate():
    # Currently unsupported
    async for chunk in program(async_mode=True, stream=True):
        print(chunk, flush=True)

async def main():
    await asyncio.gather(generate(), counter())

asyncio.run(main())
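To illustrate what calling acreate internally would buy: with the pre-1.0 openai SDK, acreate with stream=True hands back an async generator, so waiting on chunks keeps the event loop free for other tasks. A minimal sketch of that behavior on its own:

import asyncio
import openai

async def stream_poem():
    # Awaiting acreate does not block the loop; with stream=True the
    # awaited result is an async generator of incremental chunks.
    response = await openai.ChatCompletion.acreate(
        model='gpt-3.5-turbo',
        messages=[{'role': 'user', 'content': 'Write a poem about sparkling water'}],
        stream=True,
    )
    async for chunk in response:
        delta = chunk['choices'][0]['delta']
        if 'content' in delta:
            print(delta['content'], end='', flush=True)

asyncio.run(stream_poem())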
I've got a working workaround here https://github.com/andaag/chattergpt/blob/e199ccc99b7d8476275f4ed5720b268d506490c1/shared.py#L41 - it's a bit ugly as it accesses internal classes, but should work fine until something is implemented.
This is indeed much needed. Will post when this is pushed.
Thanks for the link; using the internal done function was what I was missing to get hacky streaming working with FastAPI.
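For anyone attempting the same thing, the general shape of that FastAPI hack is: run the blocking generation call in a worker thread, push tokens into an asyncio.Queue, and serve them with StreamingResponse. A rough sketch, where run_program_blocking and its on_token callback are hypothetical stand-ins for the guidance internals the linked workaround wires into:

import asyncio
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

app = FastAPI()

def run_program_blocking(on_token):
    # Placeholder for the blocking guidance call; the real workaround
    # feeds model tokens through here instead.
    for token in ['sparkling ', 'water ', 'fizzes']:
        on_token(token)

@app.get('/stream')
async def stream():
    queue: asyncio.Queue = asyncio.Queue()
    loop = asyncio.get_running_loop()
    done = object()  # sentinel marking end of generation

    def on_token(token):
        # Called from the worker thread; hand tokens to the event loop.
        loop.call_soon_threadsafe(queue.put_nowait, token)

    async def producer():
        await loop.run_in_executor(None, run_program_blocking, on_token)
        queue.put_nowait(done)

    asyncio.create_task(producer())

    async def token_stream():
        while True:
            item = await queue.get()
            if item is done:
                break
            yield item

    return StreamingResponse(token_stream(), media_type='text/plain')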
Thank you @slundberg. The guidance library has tremendous potential for controlling nuanced chatbot behavior vs. langchain, and we appreciate your efforts.
0.0.56 now has a first version of streaming support. We will also need to update library calls so that variables update more frequently, but any comments or feedback are welcome over in #129.
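Assuming the call shape requested above is what landed (a sketch only, not the confirmed 0.0.56 API; see #129 for the details), usage might look like:

import asyncio
import guidance

guidance.llm = guidance.llms.OpenAI(model='gpt-3.5-turbo')
program = guidance("{{#user~}}Write a poem about sparkling water{{~/user}}"
                   "{{#assistant~}}{{gen 'poem'}}{{~/assistant}}")

async def main():
    async for state in program(async_mode=True, stream=True):
        # Assumes each yielded item stringifies to the expanded
        # program text generated so far.
        print(str(state), flush=True)

asyncio.run(main())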
Closing for now, feel free to reopen if needed.