
ExecutionBatchError: Failed "execute_batch":

Open MLHafizur opened this issue 1 year ago • 2 comments

We have deployed models using an MLServer custom runtime. We get the following error during inference:

Task exception was never retrieved
future: <Task finished name='Task-18' coro=<<coroutine without __name__>()> exception=ExecuteBatchError('Failed "execute_batch": (<grpc._cython.cygrpc.SendInitialMetadataOperation object at 0x7fa530536540>, <grpc._cython.cygrpc.SendStatusFromServerOperation object at 0x7fa43c4c2a00>)')>
Traceback (most recent call last):
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/server.pyx.pxi", line 719, in _handle_exceptions
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 184, in _send_error_status_from_server
  File "src/python/grpcio/grpc/_cython/_cygrpc/aio/callback_common.pyx.pxi", line 98, in execute_batch
grpc._cython.cygrpc.ExecuteBatchError: Failed "execute_batch": (<grpc._cython.cygrpc.SendInitialMetadataOperation object at 0x7fa530536540>, <grpc._cython.cygrpc.SendStatusFromServerOperation object at 0x7fa43c4c2a00>)

Never seen this error before, any idea?

MLHafizur avatar Mar 10 '23 01:03 MLHafizur

I found a couple issues in the gRPC repo that look relevant:

  • https://github.com/grpc/grpc/issues/31570
  • https://github.com/grpc/grpc/issues/30984

From the later comments on the second issue, it sounds like this could be caused by a client-side disconnect while a gRPC request is being processed with the new AsyncIO API.
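
For context on the first line of the log: "Task exception was never retrieved" is the message asyncio reports when a task finishes with an exception and nothing ever awaits it or reads its result. Below is a minimal stdlib-only sketch of that mechanism. It does not involve gRPC at all; `failing_handler` is a hypothetical stand-in for a server-side coroutine that fails while sending a status to a client that has already gone away.

```python
import asyncio
import gc


async def failing_handler():
    # Hypothetical stand-in for a server-side send that fails
    # because the client has already disconnected.
    raise RuntimeError("simulated send failure after client disconnect")


async def main():
    captured = []
    loop = asyncio.get_running_loop()
    # Collect what asyncio would otherwise just log.
    loop.set_exception_handler(lambda _loop, context: captured.append(context))

    task = asyncio.create_task(failing_handler())
    await asyncio.sleep(0)  # yield once so the task runs and fails

    # Nobody awaited the task or called task.exception(), so when the
    # task object is destroyed asyncio reports the unretrieved exception.
    del task
    gc.collect()
    return captured


if __name__ == "__main__":
    for context in asyncio.run(main()):
        print(context["message"], "->", repr(context["exception"]))
```

If the theory above is right, the message is a symptom of the disconnect rather than a bug in the model code, since the failing task is created inside the gRPC server and not by the runtime's own handlers.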

tjohnson31415 avatar Mar 10 '23 18:03 tjohnson31415

@MLHafizur this might indicate that the MM (ModelMesh) and/or adapter containers restarted. Could you check whether that's the case?
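
One way to check is to look at the `restartCount` field in each pod's `status.containerStatuses`. A rough sketch, assuming the pod list has been saved with `kubectl get pods -n <namespace> -o json`; the helper name `restarted_containers` and the pod and container names in the sample are hypothetical and only there for illustration.

```python
import json


def restarted_containers(pod_list):
    """Return (pod, container, restart_count) for containers that restarted."""
    found = []
    for pod in pod_list.get("items", []):
        pod_name = pod["metadata"]["name"]
        for status in pod.get("status", {}).get("containerStatuses", []):
            if status.get("restartCount", 0) > 0:
                found.append((pod_name, status["name"], status["restartCount"]))
    return found


# Hypothetical sample in the shape of `kubectl get pods -o json` output.
sample = json.loads("""
{
  "items": [
    {
      "metadata": {"name": "example-runtime-pod"},
      "status": {
        "containerStatuses": [
          {"name": "mm", "restartCount": 2},
          {"name": "example-adapter", "restartCount": 0}
        ]
      }
    }
  ]
}
""")

if __name__ == "__main__":
    for pod, container, count in restarted_containers(sample):
        print(f"{pod}/{container} restarted {count} time(s)")
```

A non-zero count on the ModelMesh or adapter container around the time of the error would support the restart explanation.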

njhill avatar Mar 10 '23 22:03 njhill