Model gateway silently ignores max payload size error
Describe the bug
When the payload exceeds the broker's maximum message size (i.e. `message.max.bytes`), the model gateway may print an error to its logs but then silently ignore it. Upstream, since the pipeline gateway never learns about that error, it keeps waiting for a response and eventually times out.
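For reference, a quick size check (assuming a stock broker, where `message.max.bytes` defaults to roughly 1 MiB) shows why the dummy payload used in the repro below trips the limit:

```python
import numpy as np

# Same shape as the dummy model's output in the repro steps below.
pred = np.random.rand(3, 1024, 1024)
print(pred.nbytes)             # 25165824 bytes
print(pred.nbytes / 2**20)     # 24.0 MiB, well above the ~1 MiB default
```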
To reproduce
- Deploy a dummy model that returns large payloads (i.e. exceeding the producer's and broker's `message.max.bytes`), e.g.:

```python
import numpy as np

from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse


class SlowModel(MLModel):
    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        pred = np.random.rand(3, 1024, 1024)
        return InferenceResponse(
            model_name=self.name,
            model_version=self.version,
            outputs=[NumpyCodec.encode_output("foo", pred)],
        )
```

- Put the model above into a pipeline (to make sure requests go through the model gateway).
- Enable `debug: all` in the Kafka settings.
- Send a request via the pipeline and observe that it eventually times out (the exact timeout may depend on the environment, but in my case it times out after exactly 5 min).
- Check out the model gateway logs, and see an error similar to the one below:
```
%7|1693583635.168|MSGSET|rdkafka#producer-1| [thrd:thinkpad.localdomain:9092/1001]: thinkpad.localdomain:9092/1001: seldon.default.model.slow-model.outputs [0]: MessageSet with 1 message(s) (MsgId 0, BaseSeq -1) encountered error: Broker: Message size too large (actions Permanent,MsgNotPersisted)
```
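For what it's worth, the broker rejection above does reach the client through librdkafka's delivery report, so the information needed to fail the request is available. The sketch below (Python with confluent-kafka rather than the gateway's actual Go client; the broker address, raised client limit and reuse of the topic name are assumptions) reproduces the same broker-side error and shows where it surfaces:

```python
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",   # assumed local broker, default limits
    "message.max.bytes": 32 * 1024 * 1024,   # raise the client limit so the send
                                             # actually reaches the broker
})

def on_delivery(err, msg):
    if err is not None:
        # With the broker still at its default limit, err reports
        # "Message size too large" for the oversized payload below.
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()} [{msg.partition()}]")

producer.produce(
    "seldon.default.model.slow-model.outputs",   # topic from the log above
    value=b"x" * (24 * 1024 * 1024),             # ~24 MiB dummy payload
    on_delivery=on_delivery,
)
producer.flush()
```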
Expected behaviour
The model gateway should send an error back upstream that makes it clear the request failed because of Kafka's message size limits.
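To make that concrete, here is a rough sketch of the behaviour I'd expect, written in Python purely for illustration (the actual model gateway is written in Go, and `pending_request` is just a stand-in for however the gateway tracks in-flight requests): fail the request from the delivery report instead of only logging the error.

```python
from confluent_kafka import KafkaError

def on_delivery(err, msg, pending_request):
    """Delivery report hook: resolve or fail the in-flight request explicitly.

    In practice the request handle would be bound per message, e.g. via
    functools.partial; `pending_request` is assumed to behave like a Future.
    """
    if err is None:
        pending_request.set_result(msg)
    elif err.code() == KafkaError.MSG_SIZE_TOO_LARGE:
        # Surface the root cause so the pipeline gateway receives an error
        # instead of waiting for a response that will never arrive.
        pending_request.set_exception(
            RuntimeError(
                "model response exceeds Kafka's max message size "
                "(message.max.bytes); raise the broker/producer limits "
                "or reduce the payload"
            )
        )
    else:
        pending_request.set_exception(RuntimeError(str(err)))
```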