probe-cli feat(engine): add randomtraffic experiment

Open JaxGames5225 opened this issue 2 years ago • 0 comments

Checklist

[x] I have read the contribution guidelines
[x] reference issue for this pull request: https://github.com/ooni/probe/issues/2447
[x] if you changed anything related to how experiments work and you need to reflect these changes in the ooni/spec repository, please link to the related ooni/spec pull request: https://github.com/ooni/spec/pull/271
[x] if you changed code inside an experiment, make sure you bump its version number

Description

This test aims to detect the censorship of fully random traffic. In short, the experiment sends random bytes to an IP address chosen at random from a predetermined list of public IP addresses that were affected by this censorship in the past, and records information about the nature of the censorship. This censorship was originally observed at the Great Firewall of China (GFW).

Censorship Description

Our team reverse engineered the GFW's new censorship system and determined that it uses the following rules to exempt traffic from blocking:

For the first TCP payload sent by the client, allow the traffic to continue if any of the following hold:

It matches the protocol fingerprint for TLS or HTTP.
The first six bytes of the payload are all in the printable ASCII range [0x20, 0x7e].
More than 50% of the payload's bytes are in [0x20, 0x7e].
More than 20 contiguous bytes of the payload are in [0x20, 0x7e].
popcount(payload)/len(payload), the average number of set bits per byte, is less than 3.4 or greater than 4.6.

In addition to these rules, the censorship only occurs when connecting to a certain list of IP addresses.

If the IP address is in the censored range and none of the above hold, there is approximately a 26.3% chance that the connection is censored. For a more detailed description of the censorship, please see the reading copy of our paper.
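The byte-level exemption rules above can be sketched in Go as follows. This is a minimal illustration, not the code from this pull request: the names `printable` and `exempt` are hypothetical, the TLS/HTTP fingerprint rule is left out because the description does not specify how fingerprints are matched, and treating an empty payload as exempt is an assumption.

```go
package main

import (
	"fmt"
	"math/bits"
)

// printable reports whether b lies in the range [0x20, 0x7e].
func printable(b byte) bool { return b >= 0x20 && b <= 0x7e }

// exempt reports whether payload satisfies any of the byte-level exemption
// rules listed above. Protocol fingerprinting (TLS/HTTP) is NOT modelled.
func exempt(payload []byte) bool {
	n := len(payload)
	if n == 0 {
		return true // assumption: the description does not cover empty payloads
	}
	ones, count, run, longest := 0, 0, 0, 0
	for _, b := range payload {
		ones += bits.OnesCount8(b)
		if printable(b) {
			count++
			run++
			if run > longest {
				longest = run
			}
		} else {
			run = 0
		}
	}
	// Rule: the first six bytes are all in [0x20, 0x7e].
	firstSix := n >= 6
	for i := 0; firstSix && i < 6; i++ {
		firstSix = printable(payload[i])
	}
	avg := float64(ones) / float64(n) // set bits per byte
	return firstSix ||
		2*count > n || // more than 50% of the bytes are in range
		longest > 20 || // more than 20 contiguous bytes are in range
		avg < 3.4 || avg > 4.6
}

func main() {
	fmt.Println(exempt(make([]byte, 64)))             // null bytes: popcount ratio is 0
	fmt.Println(exempt([]byte("GET / HTTP/1.1\r\n"))) // printable prefix
}
```

Under these rules a stream of null bytes has a popcount ratio of 0, so it would be exempt, which is consistent with null bytes being used as the control payload in the procedure below.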

Test Goals and Procedure

The main goal of the test is to inform the user whether or not they are experiencing censorship on connections that send fully encrypted packets that appear random, as well as to record information about censored packets in order to better understand the censorship algorithm. The test seeks to accomplish these goals by doing the following:

1. If the user gives no IP address, select one from the list of IP addresses in the affected range.
2. Complete a TCP handshake with the IP address and send a stream of null bytes as a control test. If this control test succeeds, proceed with the experiment; otherwise, retry the control test with a new IP address up to two more times, stopping as soon as one succeeds. If no control test succeeds, end the test and return the error.
3. Complete a TCP handshake with the IP address and send a stream of random bytes. If this connection times out, connect once more to check for residual censorship. If the residual censorship test also times out, end the test, record information about the blocked packet, and inform the user that they are experiencing censorship. Otherwise, continue with the test.
4. Step 3 is repeated 19 more times (20 random-byte connections in total) to account for the blocking rate.
5. If no errors occurred and the test ran to completion, close all connections and inform the user that they are not experiencing censorship.
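The control flow of the procedure above can be sketched in Go as follows. This is an illustration under stated assumptions, not the code from this pull request: `probe`, `control`, `measure`, and `errNoControl` are hypothetical names, the network I/O is hidden behind the `probe` hook (how a timeout is detected is left to that hook), and the residual check is modelled as a second call with no payload.

```go
package main

import (
	"crypto/rand"
	"errors"
	"fmt"
)

// probe is a hypothetical hook: it completes a TCP handshake with addr, sends
// payload (if any), and reports whether the connection timed out.
type probe func(addr string, payload []byte) (timedOut bool, err error)

// errNoControl is returned when every control attempt fails.
var errNoControl = errors.New("control test failed for every address tried")

const maxControl = 3 // the first attempt plus two retries

// control sends null bytes to at most maxControl addresses and returns the
// first address whose control connection succeeds.
func control(p probe, addrs []string, size int) (string, error) {
	for i, addr := range addrs {
		if i == maxControl {
			break
		}
		timedOut, err := p(addr, make([]byte, size))
		if err == nil && !timedOut {
			return addr, nil
		}
	}
	return "", errNoControl
}

// measure sends up to trials random payloads to addr. It reports censorship
// only when a timeout is followed by a second timeout on reconnect.
func measure(p probe, addr string, trials, size int) (bool, error) {
	for i := 0; i < trials; i++ {
		payload := make([]byte, size)
		if _, err := rand.Read(payload); err != nil {
			return false, err
		}
		timedOut, err := p(addr, payload)
		if err != nil {
			return false, err
		}
		if !timedOut {
			continue
		}
		// Residual censorship check (assumption: connect with no payload).
		timedOut, err = p(addr, nil)
		if err != nil {
			return false, err
		}
		if timedOut {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	// Stub that mimics residual blocking: once a non-null payload is seen,
	// every later connection to that address times out.
	blocked := map[string]bool{}
	stub := func(addr string, payload []byte) (bool, error) {
		for _, b := range payload {
			if b != 0 {
				blocked[addr] = true
			}
		}
		return blocked[addr], nil
	}
	addr, err := control(stub, []string{"192.0.2.1:80"}, 64)
	if err != nil {
		fmt.Println("control failed:", err)
		return
	}
	censored, err := measure(stub, addr, 20, 64)
	fmt.Println(addr, censored, err)
}
```

Keeping the network access behind a function value makes the retry and residual-check logic testable without a live connection; a real hook would wrap the TCP dial, write, and timeout handling.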

False Negative and False Positive Rates

Using an IP known to be in the censored range, the false negative rate (the rate at which the test will say there is no censorship present when in fact there is) of this test was calculated to be approximately 1.05%. On the other hand, after running the test 10,000 times from a location not experiencing censorship, no false positives were recorded.

IP List Construction

The IP list was created by first obtaining a large list of public TCP servers. The test was then performed five times on each IP address from a computer where censorship is expected. The final list is made up of only the IP addresses for which censorship was reported all five times. For one of these IP addresses not to be in the censored range, each of its five reports of censorship would have had to be a false positive, which we know to be extremely unlikely, so we can label these IP addresses as being in the censored range.
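The filtering step described above can be sketched in Go as follows. The function name `alwaysCensored`, the map-based input format, and the example addresses (taken from the documentation range 192.0.2.0/24) are all hypothetical; the actual tooling used to build the list is not part of this description.

```go
package main

import (
	"fmt"
	"sort"
)

// alwaysCensored returns, in sorted order, the addresses that have exactly
// `runs` reports, all of which indicate censorship.
func alwaysCensored(results map[string][]bool, runs int) []string {
	var keep []string
	for addr, reports := range results {
		if len(reports) != runs {
			continue // incomplete measurements are dropped
		}
		all := true
		for _, censored := range reports {
			if !censored {
				all = false
				break
			}
		}
		if all {
			keep = append(keep, addr)
		}
	}
	sort.Strings(keep)
	return keep
}

func main() {
	results := map[string][]bool{
		"192.0.2.10:80": {true, true, true, true, true},
		"192.0.2.11:80": {true, false, true, true, true},
	}
	fmt.Println(alwaysCensored(results, 5)) // prints [192.0.2.10:80]
}
```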

Jan 13 '23 15:01 JaxGames5225
