
Proposal: Frequency tier for each URL

Open bact opened this issue 5 years ago • 4 comments

Context

  • Country lists tend only to grow, since we want to keep tracking censorship that may outlast the site itself
  • More probes are mobile, and these do not test the whole list in one run
  • This results in sparse test data points: a URL may get only a few tests from a large enough set of ISPs over a period of time.

Proposal

  • Assign a tier number to each URL. This tier number lets OONI decide how frequently to include the URL in a test list that is generated and distributed to a probe.
  • For example:
    • an active URL will get a normal tier and get tested every day
      • this is to get evidence of censorship that matters for normal users
    • a URL that is no longer active (dead link, moved, changed owner, etc.) may get a lower tier and get tested every three days or every week
      • this is to get evidence of censorship that matters for researchers who want to know whether censorship of the URL is still going on (even if the site is no longer around) or has already been lifted.
    • an additional higher tier may be added for special situations where a very few selected URLs need to be probed more often
  • This will allow us to have a long test list and still balance the resources needed for testing
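
To make the idea concrete, here is a minimal sketch of how tier-based selection could work when a test list is generated. The tier names, the `TIER_INTERVAL` mapping, the `last_tested` field, and the 6-hour interval for the higher tier are all assumptions made for illustration; none of them is an existing OONI data format or API.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from tier to the minimum time between two tests.
# "normal" and "low" follow the examples in the proposal; the "high"
# value is an assumed placeholder for "probed more often".
TIER_INTERVAL = {
    "high": timedelta(hours=6),
    "normal": timedelta(days=1),
    "low": timedelta(days=3),
}


def is_due(tier, last_tested, now):
    """Return True if a URL in this tier should be tested again."""
    if last_tested is None:
        # Never tested before: always include it.
        return True
    return now - last_tested >= TIER_INTERVAL[tier]


def build_test_list(entries, now):
    """Select the URLs whose tier interval has elapsed."""
    return [
        entry["url"]
        for entry in entries
        if is_due(entry["tier"], entry.get("last_tested"), now)
    ]
```

With this approach the full list can keep growing while each generated list only contains the URLs that are due, which is how the resource cost stays bounded.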

Related Issues

  • #410

bact avatar Mar 17 '20 20:03 bact

Yes this is very much aligned with what we are currently planning at OONI to add support for prioritisation of URLs.

Relevant issues where we are discussing this include: https://github.com/ooni/backend/issues/312 https://github.com/ooni/backend/issues/361 https://github.com/ooni/ooni.org/issues/431

hellais avatar Mar 17 '20 20:03 hellais

Or we can have "known_active_on" and "human_check_on" fields to record when a human maintainer last saw the site online and confirmed that it is still the site described in the description field (from #380).

The field value can be either:

  • DATE (YYYY-MM-DD) or
  • 0 or null or empty

If the value is 0/null/empty, it means the last time the site was known to be active/last checked is the date of its inclusion in the test list.

And from this data, OONI can automatically decide how to assign priority (without a human having to do the assignment).
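
A rough sketch of how such automatic assignment might look, assuming the field semantics described above. The function names, the `date_added` parameter, the tier names, and the 180-day staleness threshold are hypothetical and chosen only for illustration. The sketch uses "known_active_on"; "human_check_on" could be parsed the same way.

```python
from datetime import date


def parse_field(value, date_added):
    """Parse a known_active_on / human_check_on value.

    0, null (None) or an empty value fall back to the date the URL
    was added to the test list, as described above.
    """
    if value in (0, "0", None, ""):
        return date_added
    return date.fromisoformat(value)  # expects YYYY-MM-DD


def assign_tier(known_active_on, date_added, today, stale_after_days=180):
    """Give a lower tier to URLs not seen active for a long time.

    stale_after_days is an assumed threshold, not an OONI setting.
    """
    last_active = parse_field(known_active_on, date_added)
    if (today - last_active).days <= stale_after_days:
        return "normal"
    return "low"
```

For example, a URL with known_active_on set to 2020-03-01 would get the normal tier when evaluated on 2020-03-18, while a URL with an empty value that was added in 2018 would fall to the lower tier.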

bact avatar Mar 18 '20 12:03 bact

I would agree to add prioritization to URLs on test lists.

For a long time I have set up custom test decks with custom lists to achieve something similar, both with OONI running on headless devices and by sending different lists to folks.

andresazp avatar Apr 11 '20 01:04 andresazp