vert.x
vert.x copied to clipboard
The uri() method and the absoluteURI() method in HttpServerRequestImpl are inconsistent in judging the legality of uri.
Vert.x 3.6.2
This is a very simple server code
public void start() throws Exception {
vertx.createHttpServer().requestHandler( req -> {
System.out.println( req.uri());
System.out.println( req.absoluteURI());
req.response()
.putHeader("content-type", "text/plain")
.end("Hello from Vert.x!");
}).listen(8080);
}
When I request the address http://127.0.0.1:8080/test?a={1}
in the browser, the uri() method will not report an error, but the absoluteURI() print error will return a null value. This inconsistent performance is not reasonable.
The contents of the error are as follows:
/test?a={1}
17:26:41.548 [vert.x-eventloop-thread-0] ERROR io.vertx.core.http.impl.HttpServerRequestImpl - Failed to create abs uri
java.net.URISyntaxException: Illegal character in query at index 8: /test?a={1}
at java.net.URI$Parser.fail(URI.java:2848)
at java.net.URI$Parser.checkChars(URI.java:3021)
at java.net.URI$Parser.parseHierarchical(URI.java:3111)
at java.net.URI$Parser.parse(URI.java:3063)
at java.net.URI.<init>(URI.java:588)
at io.vertx.core.http.impl.HttpUtils.absoluteURI(HttpUtils.java:362)
at io.vertx.core.http.impl.HttpServerRequestImpl.absoluteURI(HttpServerRequestImpl.java:379)
...
...
...
null
The first line is the result returned by the uri() method. The last line is the result returned by the absoluteURI() method.
this sounds like a bug rather that being inconsistent.
uri returns the request URI, i.e what the request sent in the status line
the absolute URI is the URI that is computed from various parts using the host, etc... it seems there is a bug in this case
Yes. I think it is. I will try to see the source code to see if it can be fixed.
please find the nature of the bug and rename this issue after it
ok
Looking at the HttpServer code this seems to be caused by the fact that uri()
returns the String value of the underlying netty request which is created by the HttpObjectDecoder. This does no validtion that this is String is in anyway a valid java.net.URI, it simply split the initial line by whitespace.
.absoluteURI()
internally calls HttpUtils.absoluteURI(...)
which validates the result of uri()
by calling new URI(req.uri())
. The URI is not needed apart from validation in this function.
To fix this Bug we either need to always validate what is returned by .uri() or stop validation in .absoluteURI() @vietj What would you suggest ?
we don't do validation on purpose, error happens because we use java.net.URI
object to parse, so we should parse differently I guess
The problem with absoluteURI() is mainly because there are unsafe characters in the get request. This was discussed in https://github.com/eclipse-vertx/vert.x/issues/1997. But I think that since uri() returns correctly, then absoluteURI() must also be returned in the correct way. After careful study, I think it is the impact of this code: https://github.com/eclipse-vertx/vert.x/blob/f68ae15171053d343590e54884ec377f095ba6b3/src/main/java/io/vertx/core/http/impl/HttpUtils.java#L360-L366
I don't understand why new URI(req.uri())
, Then get the value of uri.getScheme()
, because uri.getScheme()
always returns null.
Going back to the issues we are currently discussing, because the new URI() encounters an unsafe character, then the JDK throws an exception.
I have provided optimized code and unit tests here: https://github.com/aruis/vert.x/commit/c9c8813e5e6a0ab173e1155f1e7aa700ee7a4ee7 if you feel it is appropriate, I will pull request.
Please forgive my lame English. Looking forward to your reply @vietj
- But I think that since uri() returns correctly, then absoluteURI() must also be returned in the correct way. : I don't agree with you, both are different.
-
uri.getScheme()
does not always return null, a request URI can be an absolute URL, so you need to take it in account and test it. See https://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html 5.1.2 :Request-URI = "*" | absoluteURI | abs_path | authority
I understand that uri can contain scheme, but I have not tested in vert.x that uri.getScheme() is not null.
we don't do validation on purpose
If validation isn't needed here, then my intuition would be that the existing code
https://github.com/eclipse-vertx/vert.x/blob/cbf38bfc2cfcceb4ccbae89ba6619f0a5158dff8/src/main/java/io/vertx/core/http/impl/HttpUtils.java#L364-L379
could be adapted by replacing
URI uri = new URI(req.uri());
String scheme = uri.getScheme();
if (scheme != null && (scheme.equals("http") || scheme.equals("https"))) {
absoluteURI = uri.toString();
with
String uri = req.uri();
if (uri.startsWith("http://") || uri.startsWith("https://")) {
absoluteURI = uri;
to skip parsing the URI.
In general, if not wanting to let HttpServerRequest.absoluteURI
throw an URISyntaxException
, I think it would be better to return the unvalidated URI string there, instead of null
.
Then the invoking code can do validation if needed.
The current logging on ERROR level here
https://github.com/eclipse-vertx/vert.x/blob/29cc5bd748cd5e352c40e7f86352cac129d40c0a/src/main/java/io/vertx/core/http/impl/HttpServerRequestImpl.java#L372-L374
is something that we found to cause unnecessary clutter in production log output, especially when req.absoluteURI()
gets called multiple times during request processing (e.g. for log/tracing output).
@vietj Any thoughts on the above? Doing the "startsWith("http(s)://" check instead of parsing the URI to check for an absolute URI.
I need to look at this again @calohmn
I've started to read this again and gather information, here is what I found:.
-
{
and}
are not valid character in query string of an URI - Safari will anyway send those char in clear text in the request URI (I checked with Wireshark), other browsers might not
- the URI rfc3986 does not say anything specific about those chars (they are simply not allowed)
- the URL rfc1738 mention
{
and}
as unsafe- All unsafe characters must always be encoded within a URL.
- These characters are "{", "}", "|", "", "^", "~","[", "]", and "`".
- there are interesting concerns about this
- https://github.com/SergioBenitez/Rocket/issues/924
- https://stackoverflow.com/questions/53766725/curl-does-not-work-with-urls-with-curly-braces-in-parameters
- the
HttpServerRequest#uri()
method returns the HTTP request uri as provided by the HTTP request with no interpretation - the
HttpServerRequest#absoluteURI()
method returns a best effort to compute an URL that was called by the client, e.g it might use the HTTP server bind address when the request headerhost
is not present)
So it looks like the current implementation is correct as it will not return an invalid URL, however it might not be the best choice for the application.
I think we could add some leniency here and add a boolean flag that will validate or not the returned absolute URI and the default implementation would be strict (no breaking change).
scoped to 4.1.1 - back port might be possible
Fixed by https://github.com/eclipse-vertx/vert.x/pull/5158