jina
jina copied to clipboard
Bi-directional Streaming
Describe the feature
As I understand it, the current API only supports response streaming. Is there a way to support bi-directional streaming? I imagine it would look like the current streaming API, except the input would be a generator. This would be useful for applications such as chat-bots.
Your proposal
# then define the Executor
class MyExecutor(Executor):
@requests(on='/hello')
async def task(self, docs: Generator[MyDocument], **kwargs) -> MyDocument:
for doc in docs:
yield MyDocument(text=f'{doc.text} output')
The Client already behaves as a bidirectional stream. I do not think there is much need for this feature.
@JoanFM 2 questions:
- Is this true for the
stream_doc
method only or thepost
method too? - In the example above, the server can save context. Example:
# then define the Executor
class MyExecutor(Executor):
@requests(on='/hello')
async def task(self, docs: Generator[MyDocument], **kwargs) -> MyDocument:
conversation = ""
for doc in docs:
conversation += f"\n\n{doc.text}"
yield MyDocument(text=conversation)
The benefit of this is not having to send the previous messages in a long conversation and not having to reload state. Is there a way to do that in the current API? Maybe using executor state?
I'm reading through the code, and as I understand it, ~all endpoints~ can be streaming if GRPC
or websocket
is enabled. Is this correct? The docs imply that streaming endpoints need to have a single document as the input.
Edit: I think I was wrong about it being "all endpoints"
okey ur point is that u need to keep the context in the stack making sure it arrives at the same replica?
Yes, similar to the example in the GRPC docs for bi-directional streaming:
def RouteChat(self, request_iterator, context):
prev_notes = []
for new_note in request_iterator:
for prev_note in prev_notes:
if prev_note.location == new_note.location:
yield prev_note
prev_notes.append(new_note)
I think with GRPC it may be possible, but I am not sure if in HTTP there is a way to get an stream as input.
otherwise to get this state IN, you may want to do a nice usage of Stateful Executor but this could feel overkill.
We could have the endpoint be a normal endpoint (same as if the generator was a DocList
) when the server is http and websocket.
Also, it would be useful to know how to stream bi-directionally when using the current configuration (IE, does it only work with streaming endpoints, or do normal endpoints keep the connection open too?)
I will look at it in more detail and come back
I have been thinking about this and seems interesting.
Here are my thoughts:
- Enable this only for grpc Serving.
- For local serving with grpc, the usage of
stream_rpc
should be used forgateway-worker
communication as well.
All of this is nice, but I need to confirm that in Kubernetes world, where Jina gateway job as load balancer is taken by LinkerD, the same behavior is achieved.
@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days
@jina-ai/product This issue is stale because it has been open 90 days with no activity. Remove stale label or comment or this will be closed in 14 days