HTTP/2 error when parsing percent-encoded query string
Very simple reproducer with explanation: https://github.com/mkurz/akka-http-percent-encoding-bug (that reproducer is still based on akka-http but should work 1:1 when applied to pekko-http)
Copy-and-pasted README (in case I ever delete that repo):
pekko-http's (experimental?) HTTP/2 support does not correctly handle URL parsing errors.
Specifically, when passing an invalid percent-encoded character in the path and/or query string (here %_D).
As a reproducer, the example from the akka docs can be used, but with HTTP/2 enabled:
- https://doc.akka.io/docs/akka-http/current/introduction.html#low-level-http-server-apis
- https://doc.akka.io/docs/akka-http/current/server-side/http2.html#enable-http-2-support
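For reference, a minimal sketch of such a server (the low-level example from the linked docs plus the preview.enable-http2 setting; the pekko package and config-key prefixes are assumed here, the original reproducer uses akka-http):

```scala
import com.typesafe.config.ConfigFactory
import org.apache.pekko.actor.ActorSystem
import org.apache.pekko.http.scaladsl.Http
import org.apache.pekko.http.scaladsl.model.{ HttpRequest, HttpResponse }

object ReproServer {
  def main(args: Array[String]): Unit = {
    // HTTP/2 support sits behind a preview flag (key name assumed, pekko.* prefix)
    val config = ConfigFactory
      .parseString("pekko.http.server.preview.enable-http2 = on")
      .withFallback(ConfigFactory.load())
    implicit val system: ActorSystem = ActorSystem("repro", config)

    // Any handler works: the failure happens while the :path pseudo-header is
    // parsed, before the handler is ever invoked.
    val handler: HttpRequest => HttpResponse = _ => HttpResponse(entity = "OK")

    Http().newServerAt("localhost", 8080).bindSync(handler)
  }
}
```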
# Start the server
sbt run
# HTTP/1 works:
$ curl --http1.1 -v http://localhost:8080/?param=%_D
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /?param=%_D HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.87.0
> Accept: */*
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 400 Bad Request
< Server: akka-http/10.2.10
< Date: Wed, 15 Feb 2023 16:29:18 GMT
< Connection: close
< Content-Type: text/plain; charset=UTF-8
< Content-Length: 78
<
* Closing connection 0
Illegal request-target: Invalid input '_', expected HEXDIG (line 1, column 10)
# HTTP/2 gives error:
$ curl --http2-prior-knowledge -v http://localhost:8080/?param=%_D
* Trying 127.0.0.1:8080...
* Connected to localhost (127.0.0.1) port 8080 (#0)
* Using HTTP2, server supports multiplexing
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* h2h3 [:method: GET]
* h2h3 [:path: /?param=%_D]
* h2h3 [:scheme: http]
* h2h3 [:authority: localhost:8080]
* h2h3 [user-agent: curl/7.87.0]
* h2h3 [accept: */*]
* Using Stream ID: 1 (easy handle 0xaaabd90f0dc0)
> GET /?param=%_D HTTP/2
> Host: localhost:8080
> user-agent: curl/7.87.0
> accept: */*
>
* Connection state changed (MAX_CONCURRENT_STREAMS == 256)!
* Closing connection 0
curl: (16) Error in the HTTP2 framing layer
At some point, no matter whether HTTP/1 or HTTP/2 is used, the parser ends up here:
- https://github.com/apache/incubator-pekko-http/blob/6624fa98f43849f791bccd14c1ec3d80d091a994/http-core/src/main/scala/org/apache/pekko/http/impl/model/parser/UriParser.scala#L253-L256
As you can see, it (also) looks for pct-encoded characters, which expect a % sign followed by two HEXDIG characters:
- https://github.com/apache/incubator-pekko-http/blob/6624fa98f43849f791bccd14c1ec3d80d091a994/http-core/src/main/scala/org/apache/pekko/http/impl/model/parser/UriParser.scala#L300-L305
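The failing rule can also be exercised in isolation through the public Uri model; a minimal sketch (the exact exception text and column number may differ from the server output above):

```scala
import org.apache.pekko.http.scaladsl.model.{ IllegalUriException, Uri }

object PctEncodingRepro {
  def main(args: Array[String]): Unit = {
    try {
      // '%' not followed by two HEXDIG characters fails the pct-encoded rule
      Uri("http://localhost:8080/?param=%_D")
      println("parsed (unexpected)")
    } catch {
      case e: IllegalUriException =>
        // prints something along the lines of:
        // Illegal URI 'http://localhost:8080/?param=%_D': Invalid input '_', expected HEXDIG ...
        println(e.getMessage)
    }
  }
}
```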
Now if that fails...
- ...when using HTTP/1...
...then the method parseHttpRequestTarget fails with (= throws) an IllegalUriException, which is later handled so that a 400 Bad Request with the following body is sent:
Illegal request-target: Invalid input '_', expected HEXDIG (line 1, column 10)
(There are various places in the http-core module where IllegalUriExceptions and ParsingExceptions are caught and handled)
- ...when using HTTP/2...
PathAndQuery.parse in the following line throws a ParsingException which is never handled (this is in the org.apache.pekko.http.impl.engine.http2 package, so it is not used by HTTP/1):
- https://github.com/apache/incubator-pekko-http/blob/6624fa98f43849f791bccd14c1ec3d80d091a994/http-core/src/main/scala/org/apache/pekko/http/impl/engine/http2/hpack/HeaderDecompression.scala#L71
A couple of lines later only an IOException gets caught:
- https://github.com/apache/incubator-pekko-http/blob/6624fa98f43849f791bccd14c1ec3d80d091a994/http-core/src/main/scala/org/apache/pekko/http/impl/engine/http2/hpack/HeaderDecompression.scala#L86-L96
If you follow where the containing method parseAndEmit gets called, you can see that the ParsingException never gets handled; a rough illustration of the missing case follows below.
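To make the gap concrete, here is a purely hypothetical, self-contained sketch of the "extra case" idea that comes up later in this thread. parsePathAndQuery and handleHeaderBlock are made-up stand-ins for the HeaderDecompression internals; only ParsingException and ErrorInfo are real pekko-http model types:

```scala
import java.io.IOException
import org.apache.pekko.http.scaladsl.model.{ ErrorInfo, ParsingException }

object MissingCatchSketch {

  // Made-up stand-in for PathAndQuery.parse: reject '%' not followed by two HEXDIGs.
  def parsePathAndQuery(path: String): Unit =
    if (path.matches(""".*%(?![0-9A-Fa-f]{2}).*"""))
      throw new ParsingException(ErrorInfo("Illegal request-target", s"invalid percent-encoding in '$path'"))

  // Made-up stand-in for the header handling: today only IOException is caught,
  // so a ParsingException escapes and takes the whole connection down.
  def handleHeaderBlock(path: String): String =
    try {
      parsePathAndQuery(path)
      "dispatch request to the user handler"
    } catch {
      case e: IOException      => s"connection torn down: ${e.getMessage}" // the existing case
      case e: ParsingException => s"400 Bad Request: ${e.getMessage}"      // the missing case
    }

  def main(args: Array[String]): Unit =
    println(handleHeaderBlock("/?param=%_D"))
}
```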
Expected Result
When using HTTP/2 I would expect to also receive a 400 Bad Request with the body:
Illegal request-target: Invalid input '_', expected HEXDIG (line 1, column 10)
thanks @mkurz for the detailed description of the issue
It is probably still useful for you to log this in the akka-http project too, but it's good not to assume that we will be following issues raised in akka projects (especially since we can't copy any of the solutions).
@pjfanning I know, I just wanted to let them know as well.
@jrudolph @mdedetrich is this something that we would like to fix before a pekko-http 1.0.0 release?
I'll try and look into this tomorrow; it doesn't seem too hard to solve since @mkurz did an excellent job of specifying how to fix the bug.
Alternatively, @mkurz, would you want to contribute the fix? It seems like you are on top of things here.
In any case, it's not a blocker. It's somewhat unfortunate that some error cases fail more extremely than desired but it does not change the outcome a lot. Btw. the nice error message is also disabled by default for HTTP/1.1 so it's mostly about the status code.
I am sorry, I have so much other work to do I don't have the capacity right now to work on this issue here.
No worries
I'll see if I have time for this; the fix doesn't seem that hard (just need to add an extra case where the exceptions are caught).
I'll have a quick look, also at where to put the tests, and let you know.
Added the 1.0.0 milestone marker, but it's a nice-to-have as opposed to a release blocker.
There are several complications here:
- our HTTP/2 client does not allow sending invalid URIs (Raw-Request-URI is not yet supported)
- there's currently no way to signal non-fatal parsing errors (should we treat them as non-fatal, though?) to the demuxer, so in any case the connection will be torn down after sending the response, together with all other ongoing streams. This seems to make it more reasonable to handle these HTTP-layer errors (as opposed to HTTP/2-framing errors) with HTTP error codes, which however means that we need to find a way to signal those errors through the layers (i.e. introduce another synthetic frame type).
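Purely as an illustration of what such a synthetic, per-stream error signal could look like (made-up types, not the actual internal frame model):

```scala
import org.apache.pekko.http.scaladsl.model.ErrorInfo

// Hypothetical event model for the demuxer: one extra, synthetic event carrying a
// per-stream parse failure, so the demuxer could answer with a 400 on that stream
// instead of tearing down the whole connection. Names are made up for illustration.
sealed trait DemuxEvent { def streamId: Int }
final case class ParsedHeaders(streamId: Int, headers: Map[String, String]) extends DemuxEvent
final case class MalformedRequest(streamId: Int, error: ErrorInfo) extends DemuxEvent
```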
So, nothing to fix quickly, but I'll have a look into it when I find some time.
So I don't know the HTTP/2 implementation in depth, but to me this appears to be a non-fatal/business-logic error (i.e. it happens when the client sends a malformed URI, so it is in the same class of errors as validation). If, as you say, there is currently no way for HTTP/2 to gracefully handle business/validation errors, we should probably fix that first before approaching this.
@pjfanning Considering what's just been said, I think it's safer to park this for now and remove it from the 1.0.0 milestone.