
Throw 500s when Triton encounters exceptions

Open aspctu opened this issue 1 year ago • 0 comments

:rocket: What

This PR updates the Triton / TRT-LLM template to return a 500 response when it encounters an exception. This applies only to the non-streaming use case.

:computer: How

We raise a FastAPI `HTTPException` when we encounter a Triton `InferenceServerException`. Thanks to the change in this PR, Truss automatically propagates the exception to the underlying FastAPI server, which passes more granular response types and status codes to the client.
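A minimal sketch of the pattern described above, not the template's actual code. The `predict_non_streaming` wrapper and its `run_inference` argument are hypothetical names introduced for illustration; the fallback classes exist only so the sketch runs when `fastapi` or `tritonclient` is not installed.

```python
try:
    from fastapi import HTTPException
except ImportError:
    # Stand-in (assumption) so the sketch runs without FastAPI installed.
    class HTTPException(Exception):
        def __init__(self, status_code, detail=None):
            super().__init__(detail)
            self.status_code = status_code
            self.detail = detail

try:
    from tritonclient.utils import InferenceServerException
except ImportError:
    # Stand-in (assumption) so the sketch runs without tritonclient installed.
    class InferenceServerException(Exception):
        def __init__(self, msg, status=None, debug_details=None):
            super().__init__(msg)


def predict_non_streaming(run_inference, model_input):
    """Hypothetical wrapper: run_inference is any callable that queries Triton."""
    try:
        return run_inference(model_input)
    except InferenceServerException as exc:
        # Convert the Triton error into an HTTP 500 carrying its message.
        raise HTTPException(status_code=500, detail=str(exc)) from exc
```

Chaining with `from exc` keeps the original Triton traceback available in server logs while the client sees only the status code and message.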

:microscope: Testing

I reproduced this code in a standalone truss and confirmed that when `stream` is set to `False`, exceptions return a response with a 500 status code and an appropriate message.

aspctu, Apr 19 '24 22:04