dai-deployment-templates
dai-deployment-templates copied to clipboard
[ENHANCEMENT] Improves Error Messaging in Responses
Actually propagate some level of error message to the response.
Initially you just get a 400 - Bad Request
error and no reason for the failure. This fixes 2 problems:
- with introduction of shapley scoring. Shapley scoring in
/model/score
endpoint was potentially failing silently - generally this helps reduce the requirement of the users to go to the logs (which could be running in some infrastructure they don't have access to), so they can potentially fix typos and user errors more easily.
@orendain PTAL, I made a change which I like a bit more. Does not require any changes to api and actually allows the user to receive some message from errors.
@orendain thanks for taking a look!
wrt to your comments. I didn't really try to make the messaging any better per say... just actually make the messaging go to the user without having to dive into the rest server to get the answer... technically speaking the logs and the rest response will be the same at this point. Which I'd argue is a significant improvement over what we have now.
Also arguably swagger is the one that is defining a json response, and not spring. So while we could do it, I don't think its truly necessary. Given that the only thing you should get back in an error response is a code and a body. I think python requests
package handles it like that. 🤷♂️ I don't mind sending back a JSON, I just don't really see the point at the moment.
@orendain do you think it would make sense to have a generated swagger object ErrorResponse
? For me it seems like maybe overengineering?
Effectively, we're introducing a new feature: returning the full interpretation of a user's request, along with the raw underlying exception message. Depending on who you ask, returning these values is considered information leakage.
Heads up that the users receiving these values are often not the ones who have deployed the scorer or even have access to the logs (e.g., customers prop up a scorers for use by their customers).
The most common way to avoid this is by setting up a tripwire (try/catch) and masking/hiding the information with a safe user-facing message (which you've done by wrapped exceptions with the message "Failed scoring request ..." or "Failed loading CSV file:..." 👍🏾).
It's the exposure of the raw request and nested exception message that would be an unforeseen issue. They add some heft to the packet size, and TBH I'm not sure how a downstream component (e.g., monitoring apps or processes) could reliably parse this pattern for its own logging or propagation to its users.
@orendain this is a good point, although I would hope that our rest scorers responses are not being sent directly to end-users. Typically there's some check:
if response.status_code != 200:
give some customer error
Additionally, to a degree, this is similar behavior with our python scorers. https://github.com/h2oai/mlops-byom-images/blob/main/python-scorer/src/h2o_scorer/server/exceptions.py ... albeit without the raw request. I would say it doesn't necessarily hurt to get rid of the raw request in the error message.... but we could also get an exception due to: null
as the error message given that the java exceptions are not exceptionally verbose here.
Closing as stale