Is it possible to implement a man-in-the-middle (MITM) tool to bypass censorship?
I spent some time thinking about it. Some talented developers have dedicated a lot of effort to obfuscating network traffic, making it appear as normal Chrome traffic. So why don't we implement an MITM tool to act as a true reverse proxy?
Here's my very simple idea: the tool listens for local SOCKS connections. Whenever a connection is made, it completes a TLS handshake with the client using a self-signed certificate, then forwards the decrypted requests to the reverse proxy server (this leg can use a trusted certificate). Either way you look at it, this completely avoids the TLS-over-TLS problem.
Obviously, this tool is best suited to browsers. First, its traffic characteristics depend entirely on the behavior of the SOCKS client, and browsers are naturally the best option. Second, browsers are the friendliest toward self-signed certificates, whereas most other apps will not trust them at all.
So, is this possible? Do dynamically generated elements within web pages (such as CAPTCHAs) support reverse proxies? I'm an outsider in this field and would appreciate hearing your opinions.
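Sketched below is the local leg of that idea in Python: it terminates the browser's TLS with a local self-signed certificate and relays the decrypted bytes over a single, normally validated TLS connection to an upstream reverse proxy. This is a minimal illustration, not a real tool: the SOCKS handshake is omitted, and `cert.pem`/`key.pem` and the upstream name are placeholders I made up.

```python
import socket
import ssl
import threading

def pump(src, dst):
    # Copy bytes one direction until EOF, then close the far side.
    try:
        while data := src.recv(4096):
            dst.sendall(data)
    except OSError:
        pass
    finally:
        dst.close()

def serve(listen_port, upstream_host, certfile="cert.pem", keyfile="key.pem"):
    # Browser-facing leg: TLS with a self-signed cert the user has trusted.
    srv_ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    srv_ctx.load_cert_chain(certfile, keyfile)
    # Upstream leg: ordinary validated TLS to the reverse proxy, so only
    # one TLS layer is ever on the wire -- no TLS-over-TLS.
    up_ctx = ssl.create_default_context()
    with socket.create_server(("127.0.0.1", listen_port)) as lsock:
        while True:
            conn, _ = lsock.accept()
            client = srv_ctx.wrap_socket(conn, server_side=True)
            upstream = up_ctx.wrap_socket(
                socket.create_connection((upstream_host, 443)),
                server_hostname=upstream_host,
            )
            threading.Thread(target=pump, args=(client, upstream), daemon=True).start()
            threading.Thread(target=pump, args=(upstream, client), daemon=True).start()
```

The point of the sketch is that the decrypted stream exists only in local memory; on the wire the censor sees one ordinary TLS session to the proxy.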
do I understand it right, you want to strip the TLS from user traffic before transmission, and renegotiate it on the server?
you can do this if you:
- trust the server to not become compromised and leak user traffic
- trust yourself not to become lazy and disable SSL verification instead of configuring a new root CA. That would be catastrophic if you disable the VPN and the apps keep trying to connect. In browsers this is manageable, but in mobile apps it is trickier. On Android, you can use apk-mitm to try your idea, but it's not secure.
- don't care about QUIC (it's usually better to block HTTP/3 anyway)
- still forward ALPN of the removed TLS correctly
This is just off the top of my head. It seems to me this is fine for personal use but should probably not be widely promoted. But it's nice to solve TLS-in-TLS over CDN!
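On the ALPN point, the bookkeeping amounts to re-offering upstream whatever the client leg negotiated. A sketch in Python's `ssl` module (function names are mine, not from any existing tool):

```python
import socket
import ssl

def alpn_for_upstream(negotiated):
    # Re-offer exactly what the client leg settled on; if the client
    # negotiated nothing, offer the usual pair and let the server pick.
    return [negotiated] if negotiated else ["h2", "http/1.1"]

def open_upstream(client_tls: ssl.SSLSocket, upstream_host: str) -> ssl.SSLSocket:
    # selected_alpn_protocol() reflects the browser-facing handshake;
    # forwarding it keeps the two legs' application protocols in sync.
    ctx = ssl.create_default_context()
    ctx.set_alpn_protocols(alpn_for_upstream(client_tls.selected_alpn_protocol()))
    raw = socket.create_connection((upstream_host, 443))
    return ctx.wrap_socket(raw, server_hostname=upstream_host)
```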
Yes, you are correct. This approach is essentially more like a packet capturing tool: it impersonates the server to handshake with the client and impersonates the client to handshake with the remote server. It decrypts all encrypted traffic.
To censors, this appears from every perspective as genuine HTTP requests. However, there are significant limitations: it is limited to browsers, and it requires an uncensored, controllable server with a trusted certificate. If either condition is not met, there is a risk of privacy leakage. Additionally, support for HTTP/3 needs to be considered.
This is not a universal solution but rather positioned as a "last resort" option to navigate through particularly strict censorship periods.
For the record: https://mailarchive.ietf.org/arch/msg/httpbisa/x5K6Bgoj4x-zzoKu4b-LCSMO5us/ https://isc.sans.edu/diary/Explicit+Trusted+Proxy+in+HTTP20+ornot+so+much/17708
It's not a totally impossible idea. If the MITM proxy is trusted and run by you, on your own computer, there's not necessarily any loss of confidentiality or integrity. It's more brittle than straightforward end-to-end TLS, in the sense that it's easier to make a mistake that results in a loss of security, but if you're very careful it can probably be done right. You might run into some problems with certificate pinning.
Some circumvention tools have done local MITM. GoAgent, the first or one of the first domain fronting tools, domain-fronted HTTP and HTTPS requests through Google App Engine. Unlike, say, meek, GoAgent did not use App Engine as a simple conduit to a trusted remote proxy; it exited traffic directly from the App Engine servers. To make that work, GoAgent needed to be able to tell the App Engine server exactly what URL to fetch, and for that to work with HTTPS requests, GoAgent needed to be able to decrypt the TLS and parse the HTTP request inside. GoAgent installed a local trusted root certificate authority and MITMed all requests passing through the proxy.
I think this is basically what you have sketched, @nlifew. GoAgent wasn't doing it for the sake of a TLS fingerprint (which didn't matter as much in those days), but so that the local proxy could do domain fronting, because that is something that browsers cannot do themselves.
I know this because GoAgent had severe security bugs in its MITM implementation that exposed users to actual MITM attacks. Basically, every user had the same private key for the trusted root "GoAgent" certificate authority, and upstream TLS connections were not properly validated.
The GoAgent CA certificate is used to do a local (intentional) man-in-the-middle of HTTPS connections between the browser and proxy.py. GoAgent works by encoding HTTP requests received by proxy.py and sending them to gae.py, where gae.py makes the encoded request. gae.py then encodes the HTTP response and sends it back to proxy.py, where it is decoded and returned to the browser. In order for GoAgent to work with HTTPS sites, it needs to undo the encryption so that gae.py will know what URL to request. When proxy.py receives a CONNECT request (meaning an HTTPS site is requested), it generates and serves a fake certificate signed by the GoAgent CA. From the user's point of view, all HTTPS sites are verified by "GoAgent". In some browsers, certificate pinning prevents the GoAgent technique from working for a small number of sites. (A consequence of GoAgent's model is that HTTPS is not end-to-end. It is HTTPS between the user and App Engine, and HTTPS between App Engine and the web site, but App Engine gets to see the plaintext.)
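The per-CONNECT certificate-minting step described above can be sketched with the `cryptography` package. This is my own illustration, not GoAgent's actual code, and it deliberately avoids GoAgent's bug class: each user generates a fresh CA key locally rather than shipping a shared one.

```python
# pip install cryptography
import datetime

from cryptography import x509
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa
from cryptography.x509.oid import NameOID

def make_ca():
    # Each user generates their own CA; shipping one shared CA private key
    # to every user is the mistake that made GoAgent users MITM-able.
    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    name = x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, "Local MITM CA")])
    now = datetime.datetime.now(datetime.timezone.utc)
    cert = (
        x509.CertificateBuilder()
        .subject_name(name)
        .issuer_name(name)
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=365))
        .add_extension(x509.BasicConstraints(ca=True, path_length=None), critical=True)
        .sign(key, hashes.SHA256())
    )
    return cert, key

def mint_leaf(hostname, ca_cert, ca_key):
    # Called per CONNECT: a fake certificate for the requested host,
    # signed by the locally trusted CA, served to the browser.
    key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    now = datetime.datetime.now(datetime.timezone.utc)
    cert = (
        x509.CertificateBuilder()
        .subject_name(x509.Name([x509.NameAttribute(NameOID.COMMON_NAME, hostname)]))
        .issuer_name(ca_cert.subject)
        .public_key(key.public_key())
        .serial_number(x509.random_serial_number())
        .not_valid_before(now)
        .not_valid_after(now + datetime.timedelta(days=7))
        .add_extension(x509.SubjectAlternativeName([x509.DNSName(hostname)]), critical=False)
        .sign(ca_key, hashes.SHA256())
    )
    return cert, key
```

The upstream leg must still be validated normally; skipping that check was GoAgent's other fatal bug.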
I don't know the details, but Lantern at one point may also have used local MITM for domain fronting without an additional proxy hop or protocol overhead, which they called "direct domain fronting". I'm not sure whether or to what extent they still do it.
https://www.bamsoftware.com/papers/fronting/#sec:deploy-lantern-direct
Direct domain fronting
The Lantern network includes a geolocation server. This server is directly registered on the CDN and the Lantern client domain-fronts to it without using any proxies, reducing latency and saving proxy resources. This sort of direct domain fronting technique could in theory be implemented for any web site simply by registering it under a custom domain such as facebook.direct.getiantem.org. It could even be accomplished for HTTPS, but would require the client software to man-in-the-middle local HTTPS connections between browser and proxy, exposing the plaintext not only to the Lantern client but also to the CDN. In practice, web sites that use the CDN already expose their plaintext to the CDN, so this may be an acceptable solution.
Somewhat related, if you build a database of which domain names use which CDNs, and the IP addresses of CDN edge servers, you can send domain-fronted requests to a CDN edge server appropriate for each domain name. In this special case, you don't need any MITM or local proxy; the CDN edge server is effectively the proxy. This is what CacheBrowsing and CDNBrowsing do.
Thank you very much, @klzgrad @wkrp. It really helps me a lot. lol
See also https://github.com/poscat0x04/sni-modifier (and this paper) which disables the SNI extension.
Thanks for the reference to this paper. I wasn't aware of it. Published in 2015, it is contemporary with domain fronting and expresses basically the same idea:
Our main example is for Google services, since many services share the same Google server certificate like Youtube or Google Maps. Assuming Youtube is restricted by an SNI filtering but Maps isn’t, the bypassing technique works as follows:
- TLS Socket with domain name and port: TARGET_HTTPS_SERVER = maps.google.com and TARGET_HTTPS_PORT = 443;
- Create SNI object with: server_name = maps.google.com
- Get access to Youtube by sending HTTP host header: GET / HTTP/1.1\r\n HOST: www.youtube.com:443

As shown in Figure 3, we make the handshake using Google Maps and receive a server certificate for Maps, then we send the HTTP host header for "www.youtube.com", and we get all Youtube traffic encrypted with the Maps server certificate, both web sites sharing the same infrastructure. Once the traffic is encrypted, the firewall can not detect Youtube traffic anymore based on SNI.
Compare to https://www.bamsoftware.com/papers/fronting/#sec:introduction:
The key idea of domain fronting is the use of different domain names at different layers of communication. In an HTTPS request, the destination domain name appears in three relevant places: in the DNS query, in the TLS Server Name Indication (SNI) extension, and in the HTTP Host header. Ordinarily, the same domain name appears in all three places. In a domain-fronted request, however, the DNS query and SNI carry one name (the “front domain”), while the HTTP Host header, hidden from the censor by HTTPS encryption, carries another (the covert, forbidden destination).
This Wget command demonstrates domain fronting on Google, one of many fronting-capable services. Here, the HTTPS request has a Host header for maps.google.com, even though the DNS query and the SNI in the TLS handshake specify www.google.com. The response comes from maps.google.com.
$ wget -q -O - https://www.google.com/ --header 'Host: maps.google.com' | grep -o '<title>.*</title>'
<title>Google Maps</title>
But the idea existed earlier than that. GoAgent was using a version of it in 2013 or earlier, and a 2012 blog post by Bryce Boe had the essential ideas in embryo ("it would still be possible, in the presence of an SNI-hostname white-list, to bypass Gogo’s authentication with a custom browser (or local proxy server) that negotiates TLS handshakes using only the expected SNI-hostname yet still makes the desired Google web service request").
However, it doesn't look like the "Escape" plugin from "Efficiently Bypassing SNI-based HTTPS Filtering" uses MITM, which means it works differently than sni-modifier.
Does anyone know of a way to completely disable SNI in a modern browser that does not require compiling or modifying sources? Simply stripping SNI still works in some cases.
@wkrp
However, it doesn't look like the "Escape" plugin from "Efficiently Bypassing SNI-based HTTPS Filtering" uses MITM
I'm quite sure Escape uses MITM as well. I remember tinkering with that extension. It's just a MITM proxy implemented as an XPCOM extension. It's a shame Mozilla has deprecated that.
- used for the local MITM, as well as generating and signing target
https://github.com/goichot/Escape/blob/8b8d7938e080f53b1fadfd7bc16708f6dc10747e/Escape/chrome/content/ssl/CertificateManager.js#L26
Thank you, I stand corrected. Even so, it strikes me as a strange implementation choice. TLS MITM in an external proxy, sure, but in a browser extension it shouldn't be necessary, as you can intercept outgoing requests and rewrite headers.
For example, the old zh.wiki.unblocked browser extension (the only mirror I can find now is here) had the single purpose of domain-fronting zh.wikipedia.org with wikipedia.org. It worked by overwriting the URL and Host header in an "http-on-modify-request" handler.
Similarly, the browser extension that was once used by meek in Tor Browser for TLS camouflage crafted new requests from scratch from a JSON specification received over a local socket, no MITM required.
WebExtensions let you rewrite some HTTP headers at best, but you can't touch or even inspect anything TLS-related like SNI. There is just no API for that, and there won't be one in the foreseeable future. The trend is to restrict the capabilities of extensions. The Escape authors could not manage it even with the legacy Firefox API, which was much more powerful than what we have now.
the old zh.wiki.unblocked browser extension (the only mirror I can find now is here) had the single purpose of domain-fronting zh.wikipedia.org with wikipedia.org
I believe that works only because it's a very specific edge case. No cookies required, static page, requested domain is a subdomain of a fronting domain. Internally they open a tab that points to https://wikipedia.org which takes care of SNI and loosens browser restrictions for zh.wikipedia.org content. Same trick probably would not work for Github or YouTube because of domain mismatch and how browsers handle that. You would still get raw HTTP response but you won't be able to correctly render it.
Similarly, the browser extension that was once used by meek in Tor Browser for TLS camouflage crafted new requests from scratch from a JSON specification received over a local socket, no MITM required.
Actually it does look like MITM, in the sense that this extension terminates TLS.
That's fair. While it's possible for a browser extension to craft HTTPS requests where SNI ≠ Host, I can see how it might not be possible to make that work well with the user interface in the same browser.
Found another instance of a MITM domain fronting tool in the wild: https://github.com/mashirozx/Pixiv-Nginx/
I don't know the details, but Lantern at one point may also have used local MITM for domain fronting without an additional proxy hop or protocol overhead, which they called "direct domain fronting". I'm not sure whether or to what extent they still do it.
We do not currently MITM encrypted traffic for domain fronting and I don't believe we ever did. In my understanding, that idea was only ever theoretical for us.