
Write uProxy client code for reporting

Open fortuna opened this issue 9 years ago • 6 comments

This is the code that the Getter will use to report usage. For v1, we will report only for opted-in users.

On every connection, we'll report date and country, and the previous date and country. The client needs to track the previous date. We can suppress duplicate reports on the client side if there's no "bucket" change.

The client needs to keep track of failed requests in order to retry.
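
A minimal sketch of the client-side bookkeeping described above, not the actual uProxy code. It assumes a "bucket" is a (UTC day, country) pair and that the caller performs the network request itself and calls `acknowledge` only on success. All names here (`Bucket`, `UsageReport`, `ReporterState`, `recordConnection`, `acknowledge`) are hypothetical.

```typescript
// Hypothetical types and functions, for illustration only.
interface Bucket {
  date: string;    // UTC day, e.g. "2016-08-11"
  country: string; // two-letter country code
}

interface UsageReport {
  current: Bucket;
  previous: Bucket | null; // null on the very first report
}

interface ReporterState {
  last: Bucket | null;    // bucket of the most recent connection
  pending: UsageReport[]; // reports not yet confirmed by the server
}

function toBucket(now: Date, country: string): Bucket {
  // toISOString() is always UTC, so the first 10 characters are the UTC day.
  return { date: now.toISOString().slice(0, 10), country: country };
}

// Call on every connection. Queues a report only when the bucket changes,
// which suppresses duplicate reports within the same bucket.
function recordConnection(state: ReporterState, now: Date, country: string): ReporterState {
  const current = toBucket(now, country);
  const last = state.last;
  if (last !== null && last.date === current.date && last.country === current.country) {
    return state;
  }
  return {
    last: current,
    pending: state.pending.concat([{ current: current, previous: last }]),
  };
}

// Call after a report has been delivered. Anything still in `pending`
// is a failed or unsent request that should be retried later.
function acknowledge(state: ReporterState, sent: UsageReport): ReporterState {
  return {
    last: state.last,
    pending: state.pending.filter((r) => r !== sent),
  };
}
```

Keeping the state as a plain object would make it easy to persist between sessions, so that failed reports survive a restart. Whether that is needed depends on how #2679 shapes the API.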

This depends on #2679, since that affects the API.

Should we add domain-fronting to this change? How much more work would it be?

fortuna avatar Aug 11 '16 21:08 fortuna

One thing to figure out here is how to map IPs to countries. There are a few options on GitHub:

  • https://github.com/willscott/ip2country (2M .js file, 500K .js.gz)
  • https://github.com/WebReflection/ipcc

fortuna avatar Aug 12 '16 15:08 fortuna

wow, ip2country beats MaxMind's free download by quite a bit (GeoLite2-Country.mmdb is 8.8M gzipped, 19M unzipped), nice!

FWIW, a couple of ideas in case even 2M is too heavy:

  • prune less-likely-to-be-used data from the db if possible, settling for lower coverage / higher probability of reporting "unknown country" for some users
  • don't bundle any local db, settle for reporting only what the browser can tell us (e.g. navigator.languages - ["en-US"], timezone - (EDT), etc.), and make inferences based on that knowing we won't always be right
  • in case the lookup doesn't have to be done locally, it looks like https://www.google.com/jsapi can be used for geoip lookup, at least this is what https://github.com/codejoust/session.js does.
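
To make the second option concrete, here is a rough sketch of inferring a country from browser language tags such as those in `navigator.languages`. The function name `guessCountry` and the use of "ZZ" as the "unknown country" placeholder are assumptions for illustration. As noted above, this kind of inference will often be wrong (for example, "en-US" is a common default outside the US).

```typescript
// Hypothetical helper: returns the region subtag of the first language tag
// that has one, or "ZZ" when no tag carries a region.
function guessCountry(languages: ReadonlyArray<string>): string {
  for (const tag of languages) {
    const subtags = tag.split("-");
    // Skip the primary language subtag at index 0.
    for (let i = 1; i < subtags.length; i++) {
      const subtag = subtags[i];
      // A single-character subtag starts an extension or private-use
      // section (e.g. "en-u-ca-gregory"); nothing after it is a region.
      if (subtag.length === 1) {
        break;
      }
      // In BCP 47, a two-letter alphabetic subtag in this position is a region.
      if (/^[A-Za-z]{2}$/.test(subtag)) {
        return subtag.toUpperCase();
      }
    }
  }
  return "ZZ";
}
```

In a browser this could be called as `guessCountry(navigator.languages)`. The timezone could serve as a second, independent hint, but that is not shown here.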

I guess whatever we decide will have a size/accuracy tradeoff, e.g. MaxMind's might be the most accurate of them all

jab avatar Aug 12 '16 18:08 jab

Daniel and I chatted today. Here is the specification for v1:

v1.1

  • Log for opted-in getters only
  • Logging done by the getter using a regular connection through the sharer
  • In a first implementation, log date only, and use ZZ for "unspecified country"
  • Launch as soon as #2678 is resolved. Notice that we don't need to have the backend fully wired in order to start collecting the use dates, and we can generate reports at any time.

v1.2

  • Add country logging

Then we'll move into v2, where we add some extra privacy guarantees and log more extensively, but it's out of scope of this entry. See the v2 Milestone for that.

fortuna avatar Aug 15 '16 22:08 fortuna

(Btw recently came across http://richg42.blogspot.com/2016/08/rads-ground-breaking-lossless.html which the geoip db compression reminded me of)

jab avatar Aug 16 '16 03:08 jab

@dborkan Would you mind updating this to reflect the latest thinking on the reporting end? (This seems to reflect the thinking behind the previous client.) Also feel free to scrap this and start a new one if it's noisy.

ghost avatar Nov 21 '16 20:11 ghost

On the "metrics" branch, I have added code to the uProxy client that reports when the user starts getting access. It sends the new access date and the previous access date in an HTTPS request to uproxy.org (encrypted so that neither the sharer nor a MITM can eavesdrop).

However, this is not yet on the "master" (webstore) branch, as our plans have changed in recent weeks. The new thinking is that metrics should be sent from the server rather than from the getter. This means that users only need to opt into metrics once, at server creation time, instead of each getter opting in. The new proposal is at https://docs.google.com/document/d/1VKA8eUr2d0BXLQOxQlqeXRihCQ-YB6L47BcdDezMVL4/

dborkan avatar Nov 21 '16 23:11 dborkan