Request-level HTTP status codes in batch
@lxning @sharvil
Request-level error codes and batch responses
Right now, every client in a batch gets the same response status for the whole batch. This is confusing for several reasons:
- Clients can't trust that an HTTP 200 means their own request in the batch succeeded
- It forces serve users to demux error logs meant for a specific client
Proposed solutions
Middleware
One option is a middleware layer that tags each request with a client identifier, parses the tag in the preprocess handler, and ensures the same order is maintained in the postprocess handler. This works, but it is cumbersome and doesn't resolve the confusion around why every client gets a 200 status even when its request failed. Over time this middleware would accumulate a lot of responsibility.
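The tagging idea above can be sketched as a minimal Python handler fragment. This is an illustration only: `preprocess`/`postprocess` follow TorchServe's batch-in, list-out handler contract, but the `request_id`, `status`, and `output` keys are hypothetical names invented for this sketch, not an existing serve API.

```python
def preprocess(batch):
    # Tag each request with its position in the batch so results and errors
    # can be routed back to the right client later.
    tagged = []
    for idx, request in enumerate(batch):
        tagged.append({"request_id": idx, "body": request.get("body")})
    return tagged


def postprocess(tagged_results):
    # Restore the original order and embed a per-request status in the payload,
    # since serve currently returns one HTTP code for the whole batch.
    responses = [None] * len(tagged_results)
    for item in tagged_results:
        responses[item["request_id"]] = {
            "status": item.get("status", 200),
            "output": item.get("output"),
        }
    return responses
```

Note that the per-request status lives inside the response body here, which is exactly the workaround this proposal wants to avoid: the transport-level code is still 200 for everyone.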
Refactor BatchAggregator
https://github.com/pytorch/serve/blob/master/frontend/server/src/main/java/org/pytorch/serve/wlm/BatchAggregator.java#L58
The current batch aggregator sends a single code for the whole batch in `public void sendResponse(ModelWorkerResponse message)`, but ideally it would carry a list of per-request codes.
This may be a bit tricky to implement since `ModelWorkerResponse` itself doesn't work with lists (https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelWorkerResponse.java#L5): its code and message fields are scalars (https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelWorkerResponse.java#L17), whereas predictions is already a list (https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/ModelWorkerResponse.java#L29).
It may also require changes to https://github.com/pytorch/serve/blob/f408eaea881791261aa0090e52d2e56820b2218c/frontend/server/src/main/java/org/pytorch/serve/util/messages/RequestInput.java#L9
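To make the target shape concrete, here is a small Python sketch of what a list-based response could look like, with the status code and message moved onto each prediction instead of living once on the whole `ModelWorkerResponse`. The class and field names (`Prediction`, `BatchResponse`, `send`) are hypothetical; the actual refactor would be in the Java classes linked above.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Prediction:
    request_id: str
    status_code: int      # per-request HTTP status, not one code per batch
    message: str
    payload: bytes = b""


@dataclass
class BatchResponse:
    # Sketch of the proposed shape: codes/messages become per-request fields
    # on each prediction, so predictions alone describe the whole batch.
    predictions: List[Prediction] = field(default_factory=list)

    def send(self, send_one: Callable[[str, int, bytes], None]) -> None:
        # sendResponse would iterate and emit each client's own status code,
        # falling back to the message body when there is no payload.
        for p in self.predictions:
            send_one(p.request_id, p.status_code, p.payload or p.message.encode())
```

Under this shape, one failed request in a batch can surface its own 5xx to its own client while the other clients still receive a genuine 200.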
Relevant open issue
https://github.com/pytorch/serve/issues/1316
In particular, please take a look at this comment: https://github.com/pytorch/serve/issues/1316#issuecomment-1222058321