Inconsistent parsing behavior
I want to report an inconsistent behavior we've observed. It affects production and was very confusing and hard to debug.
Initially we had a replica set of hypercorn-based applications: the same request was processed fine by the majority of instances but failed on a few others. The failing servers didn't even log such requests.
After investigating, I discovered that the max_incomplete_event_size limit is not always enforced. The library adds the next chunk to its receive buffer and tries to parse it before enforcing the limit. So a request that exceeds the limit will still be parsed successfully if it arrives in one chunk (which is often the case).
So in practice the same request can be parsed successfully or rejected depending on how the OS and TCP stack chunk it.
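To illustrate outside of a server, here's a minimal sketch against h11 directly (the limit, header size, and split point are made up for the example):

import h11

# Headers slightly larger than max_incomplete_event_size (sizes picked just for illustration).
LIMIT = 16 * 1024
request = (
    b"GET / HTTP/1.1\r\n"
    b"Host: example.com\r\n"
    b"X-Padding: " + b"x" * (LIMIT + 1024) + b"\r\n"
    b"\r\n"
)

# One chunk: the whole request is parsed before the limit is ever checked.
conn = h11.Connection(h11.SERVER, max_incomplete_event_size=LIMIT)
conn.receive_data(request)
print(conn.next_event())  # h11.Request(...)

# Split so that a chunk leaves the buffer above the limit while the headers are
# still incomplete: the parse attempt yields no event, the limit check kicks in,
# and this is expected to raise h11.RemoteProtocolError.
conn = h11.Connection(h11.SERVER, max_incomplete_event_size=LIMIT)
conn.receive_data(request[:LIMIT + 512])
conn.next_event()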
Here is a demo: h11-issue-demo.zip
$ pip install --requirement requirements.txt
$ pytest
FAILED test_client.py::test_fails_slightly_above_limit[uvicorn_endpoint-False] - assert not True
FAILED test_client.py::test_fails_slightly_above_limit[hypercorn_endpoint-False] - assert not True
========================================================================================== 2 failed, 10 passed in 2.16s ===========================================================================================
Inconsistency highlight:
$ pytest -k test_fails_slightly_above_limit
FAILED test_client.py::test_fails_slightly_above_limit[uvicorn_endpoint-False] - assert not True
FAILED test_client.py::test_fails_slightly_above_limit[hypercorn_endpoint-False] - assert not True
==================================================================================== 2 failed, 2 passed, 8 deselected in 0.84s ====================================================================================
I think the behavior should be more stable and consistent. Should the library check the buffer size before the parsing attempt? What do you think?
This affects at least uvicorn and hypercorn. CC @Kludex @pgjones
Seems like a reasonable thing to fix, want to send a PR?
I started with tests and found this and this. So the test suite explicitly ensures this behavior. What was the motivation for handling it this way, do you remember?
The code has been there almost since the very beginning, BTW.
The main reason the limit exists in the first place is to prevent attackers from performing memory exhaustion attacks or triggering super-linear parsing times. I wasn't thinking much about consistency, and the current behavior works fine for that purpose. It's also important to allow arbitrarily large network chunk sizes for performance.
I guess looking at this again, this might be pretty difficult to fix without breaking valid code or causing a lot of performance overhead...
Not sure if it would make more sense to try to fix this here, or to improve hypercorn's logging/error-response here so if you do hit it then it's easy to debug...
The main reason the limit exists in the first place is to prevent attackers from performing memory exhaustion attacks or triggering super-linear parsing times.
Yeah, this reason was clear.
It's also important to allow arbitrarily large network chunk sizes for performance
But what's the reason for that? Shouldn't the library check the _receive_buffer size in receive_data to protect it from uncontrolled growth?
Something like
connection.receive_data(b"GET /" + one_gb_of_chars + " HTTP/1.0\r\n\r\n")
connection.next_event()
won't help
to prevent attackers from performing memory exhaustion attacks or triggering super-linear parsing times
If it's expected that an application won't call receive_data with crazy-big buffers, wouldn't it be a good idea to limit the buffer to max_incomplete_event_size?
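Roughly what I have in mind, as a hypothetical sketch (this is not h11's actual code; the function and error are made up):

def receive_data_with_limit(buffer: bytearray, chunk: bytes,
                            max_incomplete_event_size: int = 16 * 1024) -> None:
    # Enforce the limit when data is added, before any parse attempt,
    # so the outcome doesn't depend on how the bytes were chunked.
    if len(buffer) + len(chunk) > max_incomplete_event_size:
        raise ValueError("receive buffer would exceed max_incomplete_event_size")
    buffer += chunk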
Not sure if it would make more sense to try to fix this here, or to improve hypercorn's logging/error-response here so if you do hit it then it's easy to debug...
That affects other servers too, uvicorn at least. Logging there could definitely be improved, but that won't make the behavior less confusing.
In our case the server would raise an error for 1-2% of identical requests. I'd prefer to have them all fail (as they all effectively violate the max_incomplete_event_size limit), so we could see that in monitoring and fix the client.
The thing is that we want to allow
connection.receive_data(b"POST / HTTP/1.0\r\n" + b"Content-Length: 1000000000\r\n\r\n" + one_gb_of_chars)
connection.next_event()
This is a totally normal, legit HTTP transaction, and forcing users to break up the data they pass to receive_data into small chunks just for our parsing convenience would make our API awkward and slow (and also be a backcompat break).
I do get how the inconsistency you're talking about is frustrating and confusing; I'm just not sure what the best way forward is given all the competing constraints.
Maybe there's something like: put a new limit specifically on headers, and make it slightly stricter than the current limit, so in practice it always gets hit first (and is therefore consistent), and the existing limit becomes a just-in-case backstop to catch anything we missed?
Thanks for this explanation, now it's clearer to me.
Maybe there's something like, put a new limit specifically on headers
Yeah, that sounds reasonable. I'll think about that option.
Maybe that could fix it?
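Roughly along these lines, perhaps (just a sketch of the idea; max_header_size and the exact check are invented):

MAX_INCOMPLETE_EVENT_SIZE = 16 * 1024
MAX_HEADER_SIZE = MAX_INCOMPLETE_EVENT_SIZE - 1024  # slightly stricter, so it's hit first

def check_header_limit(receive_buffer: bytes) -> None:
    # Run on every parse attempt while the request line/headers are being read:
    # measure the header block itself, whether or not it's complete yet, so the
    # limit triggers the same way regardless of how the bytes were chunked.
    end = receive_buffer.find(b"\r\n\r\n")
    header_len = len(receive_buffer) if end == -1 else end + 4
    if header_len > MAX_HEADER_SIZE:
        raise ValueError("header block exceeds max_header_size")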