http-parser
count the URL and headers separately when checking for HPE_HEADER_OVERFLOW errors
Counting headers, URL, and start line separately, because everything before the headers_done state was previously counted as headers, throwing HPE_HEADER_OVERFLOW errors even when what really caused the overflow was the request URL being too long.
Adding an HPE_URL_OVERFLOW error so that embedders still have a request-line limit, while differentiating what actually went over that limit.
I think it would be useful to refer to what is being limited using the names from the RFCs' BNF.
What's the "start line"? Nothing in RFC 2616 is called that.
Request-Line has more than a Request-URI, it also has Method, HTTP-Version, and whitespace, any of which could be unreasonably long when written maliciously.
Does this measure URI completely separately from the other elements of the Request-Line? If so, is there some other limit on the Request-Line size?
@sam-github
> What's the "start line"? Nothing in RFC 2616 is called that.
Take a look at section 4.1 of RFC 2616; it briefly mentions it, but there's more detailed info in RFC 7230, sections 3 and 3.1.
> Request-Line has more than a Request-URI, it also has Method, HTTP-Version, and whitespace, any of which could be unreasonably long when written maliciously.
> Does this measure URI completely separately from the other elements of the Request-Line? If so, is there some other limit on the Request-Line size?
This does measure the URI completely separately, and you're absolutely correct: there should be a limit on the request-line size, since I changed the way it was validated before. I guess I was too focused on the URI and forgot about the rest. Maybe I should add it here:
https://github.com/nodejs/http-parser/blob/3a5a436499bcb8a47f16fc2c3ae999e9ad9eb6d5/http_parser.c#L740-L742
And throw a new error like HPE_START_LINE_OVERFLOW, plus a macro similar to HTTP_MAX_HEADER_SIZE and HTTP_MAX_URL_SIZE, like HTTP_MAX_START_LINE_SIZE. How does that sound?
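To make the proposal concrete, here is a rough sketch of what separate per-component counters could look like. The limit and error names follow the suggestion above; the `counters` struct and `count_start_line_byte` helper are hypothetical simplifications, not the actual http_parser internals:

```c
#include <stdint.h>

/* Hypothetical per-component limits, mirroring HTTP_MAX_HEADER_SIZE. */
#define HTTP_MAX_URL_SIZE        (8 * 1024)
#define HTTP_MAX_START_LINE_SIZE (16 * 1024)

enum hpe_err { HPE_OK, HPE_URL_OVERFLOW, HPE_START_LINE_OVERFLOW };

struct counters {
  uint32_t nread_url;        /* bytes consumed by the Request-URI      */
  uint32_t nread_start_line; /* bytes consumed by the whole start line */
};

/* Called once per byte while parsing the start line. The overall
 * start-line cap bounds method/version/whitespace too, so a malicious
 * request can't smuggle an oversized line past the URL-only check. */
static enum hpe_err count_start_line_byte(struct counters *c, int in_url) {
  if (++c->nread_start_line > HTTP_MAX_START_LINE_SIZE)
    return HPE_START_LINE_OVERFLOW;
  if (in_url && ++c->nread_url > HTTP_MAX_URL_SIZE)
    return HPE_URL_OVERFLOW;
  return HPE_OK;
}
```

The point of the two counters is that an HPE_URL_OVERFLOW can only be reported for bytes that really belong to the URI, while everything else on the line still counts against the start-line cap.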
@bnoordhuis Should I do something like this:
```c
#define HTTP_METHOD_LENGTH_MAP(XX) \
  XX(0, 6)                         \
  XX(1, 3)                         \
  XX(2, 4)                         \
  XX(3, 4)                         \
  XX(4, 3)                         \
  XX(5, 7)                         \
  XX(6, 7)                         \
  ...
```
instead, and then have `method_lengths` declared using it? Like:
```c
static const uint8_t method_lengths[] =
{
#define XX(num, length) length,
  HTTP_METHOD_LENGTH_MAP(XX)
#undef XX
};
```
That would probably decrease the impact of this change.
You may be able to do something like:
```c
static const uint8_t method_lengths[] =
{
#define XX(num, name, string) (uint8_t)(sizeof(#string) - 1),
  HTTP_METHOD_MAP(XX)
#undef XX
};
```
(but there may be a slightly less expensive way to just error-out if the request line overall is too big ...)
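The `sizeof(#string) - 1` trick above can be checked in isolation: stringizing the third macro argument yields a string literal whose `sizeof` (minus the NUL terminator) is the method-name length, computed entirely at compile time. The `TOY_METHOD_MAP` below is a stand-in for the real `HTTP_METHOD_MAP`, just to demonstrate the mechanism:

```c
#include <stdint.h>

/* Toy stand-in for http_parser's HTTP_METHOD_MAP: XX(num, name, string). */
#define TOY_METHOD_MAP(XX) \
  XX(0, DELETE, DELETE)    \
  XX(1, GET, GET)          \
  XX(2, MKACTIVITY, MKACTIVITY)

/* sizeof(#string) includes the trailing NUL, hence the -1. No
 * hand-maintained length table can drift out of sync this way. */
static const uint8_t method_lengths[] = {
#define XX(num, name, string) (uint8_t)(sizeof(#string) - 1),
  TOY_METHOD_MAP(XX)
#undef XX
};
```

Because the lengths are derived from the same map that defines the methods, adding a new method automatically gets the right length entry.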
@ploxiln's suggestion works, but it's still possible to exceed the overall limit. This PR stores state in the stack locals `url_offset` and `spaces_before_url`, but that information is lost when the input is spread over multiple `http_parser_execute()` calls.
@bnoordhuis true. I can probably do away with `url_offset`; it's just being used for comparison. Gonna try to figure something out for the spaces, though, since currently we can have multiple of them before the actual URL.
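To illustrate the point about stack locals: any counter that must survive across `http_parser_execute()` calls has to live in the parser struct, because the peer controls how the input is chunked. The `mini_parser` below is a deliberately simplified illustration (not the real http_parser API), where the count persists only because it is a struct field rather than a local:

```c
#include <stdint.h>
#include <stddef.h>

/* Simplified stand-in for http_parser: state that must survive
 * between feed() calls lives in the struct, never on the stack. */
struct mini_parser {
  uint32_t nread_url; /* persists across calls, unlike a stack local */
};

static uint32_t feed(struct mini_parser *p, const char *data, size_t len) {
  for (size_t i = 0; i < len; i++)
    if (data[i] != ' ') /* pretend every non-space byte is URL data */
      p->nread_url++;
  return p->nread_url;
}
```

Feeding `"/fo"` and then `"o"` in two calls produces the same count as feeding `"/foo"` at once only because the counter is part of the parser's persistent state; a local reset to zero on each call would let an attacker dodge the limit by splitting the request into small packets.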