TensorFlow model container stops responding after predict function throws an exception

Open kbalka opened this issue 6 years ago • 3 comments

After passing incorrect input to the predict function, I have to scale the model down and back up before it can predict again.

I think exceptions from the predict function should be caught and logged, but they should not break the prediction loop.
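As a rough illustration of that suggestion, here is a minimal sketch of a loop that catches and logs errors from the predict function and keeps serving. The names `handle_prediction_request`, `serve`, and the stand-in `predict` are hypothetical and chosen for illustration; this is not the actual code in Clipper's `rpc.py`, and the real container would send the error back over its RPC channel rather than collect results in a list.

```python
import logging

logger = logging.getLogger("model_container")


def handle_prediction_request(predict_fn, inputs):
    """Run predict_fn and return (ok, payload) instead of raising.

    Hypothetical helper for illustration, not Clipper's rpc.py code.
    """
    try:
        return True, predict_fn(inputs)
    except Exception as exc:  # deliberately broad: user model code may raise anything
        # Log the full traceback, then report the failure to the caller
        logger.exception("predict function failed")
        return False, "{}: {}".format(type(exc).__name__, exc)


def serve(predict_fn, requests):
    """Process every request in turn; one failing request does not stop the loop."""
    responses = []
    for inputs in requests:
        responses.append(handle_prediction_request(predict_fn, inputs))
    return responses


def predict(inputs):
    # Stand-in model (an assumption) that rejects inputs of the wrong length
    if len(inputs) != 3:
        raise ValueError("expected 3 values, got {}".format(len(inputs)))
    return [sum(inputs)]


# The second request is malformed; the third is still answered
results = serve(predict, [[1, 2, 3], [1, 2], [4, 5, 6]])
print(results)
```

With this shape, a bad input produces an error response for that one request, and the container stays available for the next one without a restart.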

Logs from model container:

Sent heartbeat!
Received heartbeat!
Sent heartbeat!
Received heartbeat!
Sent heartbeat!
Received heartbeat!
Sent heartbeat!
Received heartbeat!
Sent heartbeat!
Received heartbeat!
Got start of message 16
Traceback (most recent call last):
  File "/container/tf_container.py", line 110, in <module>
    rpc_service.start(model, ip, port, model_name, model_version, input_type)
  File "/container/rpc.py", line 517, in start
    self.server.run(parent_conn)
  File "/container/rpc.py", line 309, in run
    prediction_request)
  File "/container/rpc.py", line 136, in handle_prediction_request
    outputs = predict_fn(prediction_request.inputs)
  File "/container/tf_container.py", line 52, in predict_floats
    preds = self.predict_func(self.sess, inputs)
  File "deploy_resnet.py", line 11, in predict
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 905, in run
    run_metadata_ptr)
  File "/usr/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1113, in _run
    str(subfeed_t.get_shape())))
ValueError: Cannot feed value of shape (1, 10) for Tensor u'input_tensor:0', which has shape '(1, 224, 224, 3)'

After this, the container cannot predict anymore.

kbalka, Apr 11 '18 09:04

Ah, this is an excellent point. Thanks for bringing it to our attention.

dcrankshaw, Apr 15 '18 05:04

@dcrankshaw do you think it makes sense to link this with readiness probes?

santi81, May 10 '18 22:05

@simon-mo Has this been addressed?

rkooo567, Jun 05 '19 03:06