Are there any plans for http2's deep packet inspection function?

Open hanyoungho opened this issue 1 year ago • 4 comments

nDPI is capable of parsing http host, uri, user-agent, etc. for the http protocol and receiving that information, but as far as I know there is no parser for HTTP 2.0.

Do you have any plans for parsing HTTP 2.0? Of course, I know that HTTP 2.0 traffic is encrypted, but when decrypted traffic comes in, a simple parsing step seems to be needed, like Wireshark does.

In particular, ChatGPT is a hot topic these days, and since most AI chatbots, including ChatGPT, are based on HTTP2, it seems quite necessary.

hanyoungho, Sep 04 '23 10:09

I think, please correct me if I am wrong, HTTP2 is already dying while it was never really alive.

Although it makes sense to dissect HTTP2, most of the ChatGPT traffic that I am observing uses TLS. So even with HTTP2 dissection, you won't get much more information IMHO.

utoni, Sep 07 '23 13:09

AFAIK, no, there are no plans.

A patch to detect (un-encrypted) HTTP/2 has just been pushed.
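For context, a plaintext HTTP/2 connection started with prior knowledge begins with a fixed 24-byte client connection preface, so one way a detector could recognise it is a simple prefix match. The following is only a minimal sketch in C, not the code from the pushed patch; `looks_like_h2_preface` is a hypothetical name, not an nDPI function.

```c
#include <stddef.h>
#include <string.h>

/* Fixed client connection preface sent at the start of an HTTP/2 connection. */
static const char H2_PREFACE[] = "PRI * HTTP/2.0\r\n\r\nSM\r\n\r\n";
#define H2_PREFACE_LEN (sizeof(H2_PREFACE) - 1) /* 24 bytes, without the NUL */

/* Hypothetical helper (not part of nDPI): returns 1 if the payload starts
 * with the HTTP/2 client preface, 0 otherwise. */
int looks_like_h2_preface(const unsigned char *payload, size_t len)
{
  if (payload == NULL || len < H2_PREFACE_LEN)
    return 0; /* too short to contain the full preface */

  return memcmp(payload, H2_PREFACE, H2_PREFACE_LEN) == 0;
}
```

Such a check would be applied to the first client-to-server payload of a TCP flow; it says nothing about connections where HTTP/2 is negotiated inside TLS.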

While HTTP/2 is one of the most used protocols on the "internet" (basically all the HTTP traffic over TCP is HTTP/2, see https://radar.cloudflare.com/traffic), it is pretty much always encrypted, i.e. transported over TLS. You can see plaintext HTTP/2 only in 3 cases, AFAIK:

*) some (very, very) uncommon applications/apps use it without TLS
*) in a 5G core network
*) if you are the man-in-the-middle (example: proxies), i.e. you have access to the plaintext data

In my opinion, these are quite uncommon scenarios.

Furthermore, HTTP/2 is a binary protocol and extracting metadata likely requires some third-party library.
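To illustrate the "binary protocol" point: every HTTP/2 frame starts with a 9-byte header (24-bit payload length, 8-bit type, 8-bit flags, one reserved bit and a 31-bit stream identifier). Reading that header is easy, as in the rough C sketch below; the struct and function names are hypothetical and not nDPI API. The hard part comes afterwards: the payload of a HEADERS frame is HPACK-compressed (Huffman coding plus a per-connection dynamic table), so fields such as the host or path cannot be read without a stateful decoder.

```c
#include <stddef.h>
#include <stdint.h>

#define H2_FRAME_HEADER_LEN   9
#define H2_FRAME_TYPE_HEADERS 0x1 /* frame type carrying HPACK-encoded headers */

/* Hypothetical container for the decoded fixed frame header. */
struct h2_frame_header {
  uint32_t length;    /* payload length, 24 bits on the wire */
  uint8_t  type;      /* frame type (0x0 DATA, 0x1 HEADERS, ...) */
  uint8_t  flags;     /* type-specific flags */
  uint32_t stream_id; /* 31-bit stream identifier */
};

/* Hypothetical helper: decodes the 9-byte frame header at buf.
 * Returns 0 on success, -1 if the buffer is too short or an argument is NULL. */
int h2_parse_frame_header(const uint8_t *buf, size_t len,
                          struct h2_frame_header *out)
{
  if (buf == NULL || out == NULL || len < H2_FRAME_HEADER_LEN)
    return -1;

  /* All multi-byte fields are in network byte order (big endian). */
  out->length = ((uint32_t)buf[0] << 16) | ((uint32_t)buf[1] << 8) | buf[2];
  out->type   = buf[3];
  out->flags  = buf[4];
  out->stream_id = (((uint32_t)buf[5] << 24) | ((uint32_t)buf[6] << 16) |
                    ((uint32_t)buf[7] << 8)  |  (uint32_t)buf[8])
                   & 0x7fffffffu; /* mask out the reserved top bit */
  return 0;
}
```

A dissector could use something like this to walk the frames of a stream and locate the HEADERS frames, but it would still need an HPACK implementation to turn their payload into readable metadata.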

So, I think that to support HTTP/2 metadata extraction in nDPI we need some very interested party willing to help with the task.

[same considerations apply to HTTP/3]

IvanNardi, Sep 16 '23 11:09

Why is there such a big gap between https://w3techs.com/technologies/details/ce-http2 and https://radar.cloudflare.com/traffic? Am I missing something? A quick (and very subjective) look at my currently open tabs is closer to the percentage from w3techs.com. Btw, the web version of ChatGPT uses HTTP/3 for most of its traffic. It seems that some CDN content gets delivered via HTTP/2.

utoni, Sep 16 '23 12:09

> Why is there such a big gap between https://w3techs.com/technologies/details/ce-http2 and https://radar.cloudflare.com/traffic? Am I missing something?

From the first site: ~35%. From the second: 60.9% of the HTTP traffic -> 60.9% of ~50% [average value from the graph "Internet traffic trends" at the beginning of the page] -> ~30%. I think that we can consider these two values pretty much equal, even if these numbers are quite rough.

If I had to guess, if your network and browser support QUIC, most of the traffic goes via QUIC-HTTP/3 (surely if the server is behind Cloudflare/Akamai or Google/Meta). Not sure about other CDNs; Netflix for sure still uses TCP.

Bottom line: if the (HTTP) connection is not QUIC, it is likely HTTP/2.

IvanNardi, Sep 16 '23 13:09