
http://crichd.sc: insufficient `#httpDiff` heuristics

Open bassosimone opened this issue 2 years ago • 2 comments

This issue documents insufficient #httpDiff heuristics when measuring http://crichd.sc. Measurements with v0.4 are often affected by users using a DNS resolver other than their ISP's. In such cases, they end up measuring the correct website, because there is not much censorship beyond the ISP's resolver in Italy. However, with some effort, we can find a v0.4 measurement where the #httpDiff heuristics are insufficient to classify the measurement as an anomaly.

In such a measurement, the TH HTTP result is like:

image

while the probe HTTP measurement is like:

image

The x_status flag tells us that the probe determined the HTTP measurements to be equal enough. (Which is clearly wrong.)

We can perhaps get a bit more clarity around the issue if we inspect the HTTP comparison flags:

image

Here we see that the status code and the headers match. This is sufficient for saying there's no #httpDiff, even though the body is ridiculously different in size.
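
To make the failure mode concrete, here is a minimal sketch of a decision rule with the shape described above. It is an illustration only, not the probe's actual code: the function names, the 0.7 body-length threshold, and the example body sizes are assumptions chosen for the example. The point is that a single matching signal (here, the headers) is enough to suppress #httpDiff when the status codes agree, regardless of how different the bodies are.

```python
from typing import Optional


def body_length_match(probe_len: int, th_len: int,
                      threshold: float = 0.7) -> Optional[bool]:
    """Compare body sizes by ratio; the threshold is an assumed value."""
    if probe_len <= 0 or th_len <= 0:
        return None  # cannot compare without both body lengths
    proportion = min(probe_len, th_len) / max(probe_len, th_len)
    return proportion > threshold


def has_http_diff(status_code_match: Optional[bool],
                  headers_match: Optional[bool],
                  body_match: Optional[bool],
                  title_match: Optional[bool]) -> bool:
    """Hypothetical rule: flag a diff unless the status codes agree
    and at least one of headers, body length, or title also agrees."""
    if status_code_match is False:
        return True
    if headers_match is True or body_match is True or title_match is True:
        return False
    return True


# Hypothetical sizes: a tiny block page versus a full page from the TH.
body_flag = body_length_match(probe_len=500, th_len=50_000)

# Status code and headers match, so the body mismatch is ignored.
diff = has_http_diff(status_code_match=True, headers_match=True,
                     body_match=body_flag, title_match=None)
print(body_flag, diff)
```

Under these assumptions the body flag is false while the overall result still reports no difference, which mirrors the comparison flags shown in the measurement above.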

A v0.5 measurement likewise finds no #httpDiff here. This is no surprise, since we are using the same algorithm inside v0.5.

bassosimone avatar Sep 14 '22 09:09 bassosimone

Here's another instance of the same issue: https://explorer.ooni.org/measurement/20220911T105037Z_webconnectivity_IT_30722_n1_ruzuQ219SmIO9SrT?input=http%3A%2F%2Flivetv.sx%2F

bassosimone avatar Sep 14 '22 10:09 bassosimone

The problem is still current; see this Web Connectivity v0.5 measurement: https://explorer.ooni.org/m/20240125180856.931317_IT_webconnectivity_9ac2252d15d4dd97.

bassosimone avatar Jan 25 '24 18:01 bassosimone