
Request level http status codes in batch

Open msaroufim opened this issue 3 years ago • 1 comments

@lxning @sharvil

Request level error codes and batch responses

Right now, batch responses are handled so that all clients in a batch get the same response. This is confusing for two reasons:

  1. Clients can't trust that an HTTP 200 code means that their example in the batch was successful
  2. Serve users have to demux error logs to find the ones meant for a specific client

Proposed solution

Middleware

One solution is a middleware layer that tags each request for a client, parses the tag in the preprocess handler, and makes sure that the same order is maintained in the post-process handler. This works but is annoying, and it doesn't resolve the confusion around why every client gets a 200 status even when its own request failed. Over time this middleware would also need to take on a lot of responsibilities.
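A minimal sketch of what such a middleware would have to do, to make the cost concrete. This is not TorchServe code: `TaggedRequest`, `TaggedResult`, `tag_requests`, and `run_batch` are hypothetical names, and the choice of 400 for a failed example is an assumption for illustration.

```python
import uuid
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class TaggedRequest:
    request_id: str  # hypothetical tag added by the middleware
    payload: Any


@dataclass
class TaggedResult:
    request_id: str
    status: int  # per-request status, independent of the batch-level HTTP code
    body: Any


def tag_requests(payloads: List[Any]) -> List[TaggedRequest]:
    """Attach a unique id to every request before it is batched."""
    return [TaggedRequest(uuid.uuid4().hex, p) for p in payloads]


def run_batch(
    batch: List[TaggedRequest], infer_one: Callable[[Any], Any]
) -> List[TaggedResult]:
    """Process a batch in order, recording success or failure per request."""
    results = []
    for req in batch:
        try:
            results.append(TaggedResult(req.request_id, 200, infer_one(req.payload)))
        except ValueError as exc:
            # Assumption: a bad input maps to 400 for that request only.
            results.append(TaggedResult(req.request_id, 400, str(exc)))
    return results
```

Even in this small form, the client still has to look inside the body to find its own status, which is the confusion described above.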

Refactor BatchAggregator

https://github.com/pytorch/serve/blob/master/frontend/server/src/main/java/org/pytorch/serve/wlm/BatchAggregator.java#L58

The current batch aggregator sends only a single message with a single code, `public void sendResponse(ModelWorkerResponse message)`, but ideally this should work with a list.

This may be a bit tricky to implement, since ModelWorkerResponse also doesn't work with lists: https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelWorkerResponse.java#L5

Codes and messages are also not lists: https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelWorkerResponse.java#L17

Predictions, however, are: https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelWorkerResponse.java#L29

This may also require changes to RequestInput: https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/RequestInput.java#L9
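To illustrate the shape of the change, here is a rough Python model of a response where the code and message live on each prediction instead of on the whole batch. It is only a sketch of the idea, not the Java classes linked above: `Prediction`, `WorkerResponse`, and `send_responses` are hypothetical names, and falling back to 500 for a request with no prediction is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Prediction:
    request_id: str
    code: int  # per-request status instead of one code for the batch
    message: str
    body: bytes = b""


@dataclass
class WorkerResponse:
    # Hypothetical: the response carries only per-request entries.
    predictions: List[Prediction] = field(default_factory=list)


def send_responses(
    response: WorkerResponse, job_ids: List[str]
) -> Dict[str, Tuple[int, str]]:
    """Map every queued job id to the status its own client should see."""
    by_id = {p.request_id: p for p in response.predictions}
    delivered = {}
    for request_id in job_ids:
        prediction = by_id.get(request_id)
        if prediction is None:
            # Assumption: a job the worker never answered is reported as 500.
            delivered[request_id] = (500, "no prediction returned for this request")
        else:
            delivered[request_id] = (prediction.code, prediction.message)
    return delivered
```

With this shape, one failing example no longer forces the same code onto every client in the batch.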

Relevant open issue

https://github.com/pytorch/serve/issues/1316

msaroufim, Nov 17 '21 21:11

Please take a look at this comment: https://github.com/pytorch/serve/issues/1316#issuecomment-1222058321

jamessocure, Aug 22 '22 08:08