druid
druid copied to clipboard
Unify HTTP response model and API
Motivation
Current HTTP APIs return different formats of response. In most cases, if the status code is 200, a JSON object is returned. If the status code is not 200, there're several formats, some return a JSON object, some return text. For those APIs that return text, there might be vulnerabilities that allows hackers to launch an XSS attack.
In order to make code safe, this PR unifies the API that construct a safe Response object.
After this PR, for those HTTP responses whose Status Code is not 200:
- there's alway a Content-Type in the response header which is set to 'application/json'
- and the HTTP body is a JSON object which currently has one field named as "error".
{"error" : "xxxx"}
In order to make sure the new API is always used in the future code, the raw javax.ws.rs.core.Response#status
API is forbidden.
API Description
The New API is defined as HttpResponses
. Here's an example that compares how we construct a Response object in old and new API.
Old ways
return Response.status(Response.Status.BAD_REQUEST)
.entity(ImmutableMap.<String, Object>of(
"error",
StringUtils.format(UNKNOWN_AUTHENTICATOR_MSG_FORMAT, authenticatorName)
))
.build();
New API
With the New API, above code can be simplified as
return HttpResponses.BAD_REQUEST.error(UNKNOWN_AUTHENTICATOR_MSG_FORMAT, authenticatorName);
API lists
Below are the APIs that are provided by HttpResponses
// returns a WebApplicationException that wraps a Response object
public RuntimeException exception(String message);
// returns a HTTP Response whose Content-Type is text/plain
public Response text(String message);
// returns an HTTP Response in error format. And the Content-Type is application/json
public Response error(String message);
// returns an HTTP Response in error format. And the Content-Type is application/json
public Response error(@Nullable Throwable t);
// returns an HTTP Response in JSON format with given object
public Response entity(@Nonnull Object entity)
Incompatibility
Since this PR changes all current code to use the new API, there're some compatibility problems. Some HTTP endpoints now returns a JSON instead of returning a text-based response if the Response's status code is not 200. If clients have code to check the returned message for error processing, they need to change their code to adapt to the new returned format.
What's not done in this PR
Response.ok()
can be replaced by the new API HttpResponses.OK.json()
so that we don't mix the old Response
and new HttpResponses
in the code.
But it's not forbidden in this PR because the new one is just a replacement which won't introduce any compatibility problems. Since there're lots of reference of such API, and changes in this PR is already too large, I think we can do that replacement in a following PR once this is merged.
Key changed/added classes in this PR
-
ResponseStatusException
is added to wrap any exceptions that want to return specific HTTP status code -
ResponseStatusExceptionMapper
is provided to map anyResponseStatusException
to a HTTP response which contains the error message in json format -
HttpResponses
defines a set of common used HTTP status codes and APIs to return a Response object.
This PR has:
- [X] been self-reviewed.
- [ ] using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- [ ] added documentation for new or modified features or behaviors.
- [ ] added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- [ ] added or updated version, license, or notice information in licenses.yaml
- [ ] added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- [ ] added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- [ ] added integration tests.
- [X] been tested in a test Druid cluster.
Thanks for the changes, @FrankChen021! This makes the error responses much cleaner!
I had a couple of minor suggestions:
- In
ResponseStatusExceptionMapper
, consider changing the signature of method.toResponse()
to includeObject... args
so that theString.format
can be done there instead of every caller having to do it. - Consider adding a syntax sugar for bad request and not found (as these are most commonly used) over
.toResponse()
i.e. something like.toBadRequestResponse()
so that the caller can skip passing theResponse.Status.BAD_REQUEST
in the common cases.
@kfaraz My opinion is that we should use exception-thrown style instead of returning statement. The reason I keep a toResponse
function to return a Response
object is that there're lots of code using returning statement, and to change them all will introduce a burden of test.
Hey @FrankChen021, can you add some tests to verify the new response?
@jihoonson This patch updates many returning statements, it's hard to add cases for all these changes.
For the tests:
I added a UT test case to check the returned Response
object, including the Header, Object and Status Code; can check this file: indexing-service/src/test/java/org/apache/druid/indexing/overlord/http/OverlordResourceTest.java
I also updated current integration test to check if the ResponseStatusExceptionMapper
successfully coverts exception into Response
object and returned Header, Content and Status are correct. See:
integration-tests/src/test/java/org/apache/druid/tests/query/ITOverlordResourceTest.java
@kfaraz I added an override version to accept Object... args
parameters to simplify. And also I added some comments to these functions to give suggestions that when 400 or 404 is expected to return, exception throwing style is recommended.
CI failed because of insufficient line coverage.
@jihoonson Could you review this PR when you're at convenience? I saw there were conflicts with the latest changes on master, and worried about more conflicts.
@FrankChen021 my apologies, I forgot to take another look. Please feel free to ping me if I forget again and don't finish my review in time.
I took another look. Thank you for adding more tests to satisfy the coverage bot :slightly_smiling_face: The general approach you took in this PR looks good to me. It's much better-structured compared to what we had previously. However, I still have a concern about the user-facing API changes, which I mentioned in https://github.com/apache/druid/pull/12026#discussion_r770186406. I agree that your approach is better than what we have now because it makes the error response format consistent. I think we should apply the same approach to every user-facing API. And I think this should happen at once because it will cause incompatibility in the error response format for some APIs. Suppose you have a client that does something when an API returns an error. For example, it could retry the failed call. If the error response format changes after this PR, you will have to modify your client code to handle the new response format. Even though the new response format is better than the previous, it will be annoying our users if we introduce such an incompatible change over multiple releases. Since we will probably start preparing a new release sooner or later, we may not be able to have enough time to apply the same technique you used in this PR to every user-facing API. So, how about splitting your PR into two, one that fixes the vulnerability only and another that fixes the error response format?
@jihoonson A good suggestion. Let me create a new PR to fix the vulnerability only and leave this PR to unify the response model.
@FrankChen021 thanks for understanding. I will review #12295 soon.
Description has been updated to reflect the latest change. And the old API will be added to the forbidden API list in later commits.
This pull request introduces 1 alert when merging 118d26bb26c304f3204742c72c839de14112b9c9 into 903174de2045449e12f94d00bdca09ecaf780e64 - view on LGTM.com
new alerts:
- 1 for Unused format argument
@kfaraz @jihoonson Could you please take a look at the changes in this PR when you're at convenience?
Sure, @FrankChen021 . Sorry for the delay on this. Will take a look this week.
This pull request has been marked as stale due to 60 days of inactivity. It will be closed in 4 weeks if no further activity occurs. If you think that's incorrect or this pull request should instead be reviewed, please simply write any comment. Even if closed, you can still revive the PR at any time or discuss it on the [email protected] list. Thank you for your contributions.
This pull request/issue has been closed due to lack of activity. If you think that is incorrect, or the pull request requires review, you can revive the PR at any time.