Model gateway silently ignores max payload size error
Describe the bug
When the payload exceeds the broker's maximum message size (i.e. `message.max.bytes`), the model gateway may print an error to its logs but then silently ignore it. Upstream, since the pipeline gateway never learns about that error, it keeps waiting for a response and eventually times out.
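For reference, a quick size check (assuming a stock broker, where `message.max.bytes` defaults to roughly 1 MiB) shows why the dummy payload used in the repro below trips the limit:

```python
import numpy as np

# Same shape as the dummy model's output in the repro steps below.
pred = np.random.rand(3, 1024, 1024)
print(pred.nbytes)             # 25165824 bytes
print(pred.nbytes / 2**20)     # 24.0 MiB, well above the ~1 MiB default
```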
To reproduce
- Deploy a dummy model that returns large payloads (i.e. exceeding the producer's and broker's `message.max.bytes`), e.g.:

```python
import numpy as np

from mlserver import MLModel
from mlserver.codecs import NumpyCodec
from mlserver.types import InferenceRequest, InferenceResponse


class SlowModel(MLModel):
    async def predict(self, payload: InferenceRequest) -> InferenceResponse:
        pred = np.random.rand(3, 1024, 1024)
        return InferenceResponse(
            model_name=self.name,
            model_version=self.version,
            outputs=[NumpyCodec.encode_output("foo", pred)],
        )
```

- Put the model above into a pipeline (to make sure requests go through the model gateway).
- Enable `debug: all` in the Kafka settings.
- Send a request via the pipeline and observe that it eventually times out (the exact timeout may depend on the environment, but in my case it times out after exactly 5 min).
- Check out the model gateway logs, and see an error similar to the one below:
```
%7|1693583635.168|MSGSET|rdkafka#producer-1| [thrd:thinkpad.localdomain:9092/1001]: thinkpad.localdomain:9092/1001: seldon.default.model.slow-model.outputs [0]: MessageSet with 1 message(s) (MsgId 0, BaseSeq -1) encountered error: Broker: Message size too large (actions Permanent,MsgNotPersisted)
```
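For what it's worth, the broker rejection above does reach the client through librdkafka's delivery report, so the information needed to fail the request is available. The sketch below (Python with confluent-kafka rather than the gateway's actual Go client; the broker address, raised client limit and reuse of the topic name are assumptions) reproduces the same broker-side error and shows where it surfaces:

```python
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",   # assumed local broker, default limits
    "message.max.bytes": 32 * 1024 * 1024,   # raise the client limit so the send
                                             # actually reaches the broker
})

def on_delivery(err, msg):
    if err is not None:
        # With the broker still at its default limit, err reports
        # "Message size too large" for the oversized payload below.
        print(f"delivery failed: {err}")
    else:
        print(f"delivered to {msg.topic()} [{msg.partition()}]")

producer.produce(
    "seldon.default.model.slow-model.outputs",   # topic from the log above
    value=b"x" * (24 * 1024 * 1024),             # ~24 MiB dummy payload
    on_delivery=on_delivery,
)
producer.flush()
```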
Expected behaviour
The model gateway should send an error back upstream that makes it clear the request failed because of Kafka's message size limits.
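To make that concrete, here is a rough sketch of the behaviour I'd expect, written in Python purely for illustration (the actual model gateway is written in Go, and `pending_request` is just a stand-in for however the gateway tracks in-flight requests): fail the request from the delivery report instead of only logging the error.

```python
from confluent_kafka import KafkaError

def on_delivery(err, msg, pending_request):
    """Delivery report hook: resolve or fail the in-flight request explicitly.

    In practice the request handle would be bound per message, e.g. via
    functools.partial; `pending_request` is assumed to behave like a Future.
    """
    if err is None:
        pending_request.set_result(msg)
    elif err.code() == KafkaError.MSG_SIZE_TOO_LARGE:
        # Surface the root cause so the pipeline gateway receives an error
        # instead of waiting for a response that will never arrive.
        pending_request.set_exception(
            RuntimeError(
                "model response exceeds Kafka's max message size "
                "(message.max.bytes); raise the broker/producer limits "
                "or reduce the payload"
            )
        )
    else:
        pending_request.set_exception(RuntimeError(str(err)))
```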