
python tritonclient stream_infer should send end signal to callback

Jackiexiao opened this issue on Jun 23 '22 · 10 comments

This may be a newbie question, but I think we should add `self._callback(result=None, error=None)` after finishing the iteration over responses, so the callback knows the response stream is complete and no more responses will be sent. Am I right?

The relevant Python Triton client code:

https://github.com/triton-inference-server/client/blob/d8bc5b8d4a9297e8e6b470fe96ef82a4470234a2/src/python/library/tritonclient/grpc/__init__.py#L2083

    def _process_response(self, responses):
        """Worker thread function to iterate through the response stream and
        executes the provided callbacks.

        Parameters
        ----------
        responses : iterator
            The iterator to the response from the server for the
            requests in the stream.

        """
        try:
            for response in responses:
                if self._verbose:
                    print(response)
                result = error = None
                if response.error_message != "":
                    error = InferenceServerException(msg=response.error_message)
                else:
                    result = InferResult(response.infer_response)
                self._callback(result=result, error=error)
            ############ add here ##############
            self._callback(result=None, error=None) # new code
        except grpc.RpcError as rpc_error:
            error = get_error_grpc(rpc_error)
            self._callback(result=None, error=error)

Jackiexiao avatar Jun 23 '22 05:06 Jackiexiao

Oh... Triton gRPC uses a bidirectional stream, not a single request with a streamed response, so there is no need to add an end signal on that path. On the server side, the final response is marked with:

response_sender.send(flags=pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL)
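(For context, that flag is sent by the model on the server side. A minimal sketch of a decoupled Python backend model, assuming the standard `python_backend` decoupled API; the `make_chunks` generator is hypothetical:)

    import triton_python_backend_utils as pb_utils

    class TritonPythonModel:
        def execute(self, requests):
            for request in requests:
                sender = request.get_response_sender()
                # Stream an arbitrary number of responses per request.
                for tensor in make_chunks(request):  # hypothetical generator
                    sender.send(pb_utils.InferenceResponse(output_tensors=[tensor]))
                # Mark this request's stream as complete.
                sender.send(flags=pb_utils.TRITONSERVER_RESPONSE_COMPLETE_FINAL)
            # Decoupled models return None from execute().
            return None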

So, how can I get an end signal in decoupled mode (one request, multiple responses)?

Jackiexiao avatar Jun 23 '22 08:06 Jackiexiao

For the one-request, multiple-response scenario, it looks like I need to enqueue a None to indicate that sending is done:

    from functools import partial

    import tritonclient.grpc as grpcclient

    client = grpcclient.InferenceServerClient(url, verbose=False)
    client.start_stream(callback=partial(callback, user_data))
    client.async_stream_infer(
        model_name=model_name,
        inputs=triton_inputs,
        request_id=rand_int_string(),
        outputs=triton_outputs,
    )
    ############ add here ##############
    client._stream._enqueue_request(None)

Then, on the receiving side, I can use the change proposed earlier,

            ############ add here ##############
            self._callback(result=None, error=None) # new code

to get a None indicating that receiving is complete.

Jackiexiao avatar Jun 23 '22 08:06 Jackiexiao
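For illustration, a queue-backed callback could then treat that None as an end-of-stream sentinel. A minimal sketch of the workaround above (the `process` handler is hypothetical):

    import queue

    user_data = queue.Queue()

    def callback(user_data, result, error):
        # With the patched _process_response, (None, None) arrives once
        # the stream is exhausted.
        user_data.put((result, error))

    # Consumer: drain responses until the sentinel shows up.
    while True:
        result, error = user_data.get()
        if error is not None:
            raise error
        if result is None:
            break  # end of stream
        process(result)  # hypothetical handler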

Great question! Is there a specific reason you want an end signal? You can always keep a count of expected versus received responses on the caller side, as done in the decoupled examples here: https://github.com/triton-inference-server/python_backend/tree/r22.05/examples/decoupled. I don't believe end signals are generally used, except for sequence batching and the like, and there it is the user sending end signals to denote that the sequence is done.

dyastremsky avatar Jun 23 '22 17:06 dyastremsky
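For reference, when the client does know the response count, the counting pattern from those examples looks roughly like this sketch (reusing the queue-based callback from the workaround above):

    # Assuming responses arrive on the user_data queue via the stream
    # callback, and the model is known to send exactly `expected` responses.
    expected = 4  # hypothetical known count
    for _ in range(expected):
        result, error = user_data.get()
        if error is not None:
            raise error
        # ... consume result ...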

CC: @Tabrizian. I'm assuming the code sending the requests would know the number of expected responses, but perhaps there is a case where it wouldn't.

dyastremsky avatar Jun 23 '22 17:06 dyastremsky

For example, with streaming text-to-speech (using decoupled mode), the client won't know the number of expected responses in advance.

Jackiexiao avatar Jun 24 '22 01:06 Jackiexiao

@Jackiexiao This sounds like a reasonable request to me. I think we should deliver a flag to the client indicating whether this is the last response or not. Right now, it looks like the client must know beforehand the number of responses that the model is going to send. @tanmayv25 / @jbkyang-nvi What do you think?

Tabrizian avatar Jun 24 '22 14:06 Tabrizian

We can enable this as an optional feature for the clients.

tanmayv25 avatar Jul 07 '22 01:07 tanmayv25

Wonderful. I have filed a ticket to track this feature request.

dyastremsky avatar Jul 07 '22 17:07 dyastremsky

any update?

Jackiexiao avatar Aug 26 '22 10:08 Jackiexiao

Not yet. It's in our queue.

dyastremsky avatar Aug 26 '22 17:08 dyastremsky

any update?

dkalinowski avatar Aug 08 '23 11:08 dkalinowski

Yes, this was added here. It should be available from 23.06 onwards.

Thank you for the reminder. Closing this issue.

dyastremsky avatar Aug 08 '23 17:08 dyastremsky
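For readers arriving later: in 23.06+ clients this surfaced as an opt-in `enable_empty_final_response` argument to `async_stream_infer`, with the final response carrying a `triton_final_response` parameter (worth verifying against your client version). A minimal sketch, reusing the names from the snippets above:

    # Callback: detect the final-response marker and forward a sentinel.
    def callback(user_data, result, error):
        if error is not None:
            user_data.put((None, error))
            return
        response = result.get_response()
        params = response.parameters
        if "triton_final_response" in params and params["triton_final_response"].bool_param:
            user_data.put((None, None))  # end of stream
        else:
            user_data.put((result, None))

    client.start_stream(callback=partial(callback, user_data))
    client.async_stream_infer(
        model_name=model_name,
        inputs=triton_inputs,
        enable_empty_final_response=True,  # ask Triton for an explicit final response
    )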