mybinder.org is broken when redirecting to notebooks.gesis.org

Open bitnik opened this issue 5 years ago • 5 comments

What is your browser and browser version?

Chrome Version 80.0.3987.122 (Official Build) (64-bit)

What is broken and where?

mybinder.org, staging.mybinder.org, example URL: https://staging.mybinder.org/v2/gh/ogrisel/notebooks/mastersss

What is the "culprit" domain?

notebooks.gesis.org

What is your debug output for this domain?

To get the debug output, please see the instructions link above.

**** ACTION_MAP for gesis.org
gesis.org {
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 0,
  "userAction": ""
}
notebooks-test.gesis.org {
  "dnt": false,
  "heuristicAction": "",
  "nextUpdateTime": 1565035835885,
  "userAction": "user_cookieblock"
}
notebooks.gesis.org {
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 1582943208262,
  "userAction": "user_block"
}
**** SNITCH_MAP for gesis.org
gesis.org [
  "127.0.0.1",
  "0.0.0.0",
  "github.com"
]

https://mybinder.org is a federation of different public BinderHub deployments. It redirects each user to a federation member. Last week we realized that PB is blocking notebooks.gesis.org, which is also a federation member, even though there are no trackers running on notebooks.gesis.org at all.

Today I was reading this issue and after reading the comment

To follow up on the last comment: if you've had PB installed since before we added the MDFP list, your badger might have learned to block wikimedia domains before that time, and never "un-learned" to block them if you didn't reset Privacy Badger's local storage. We can fix that problem by adding a migration to Privacy Badger's startup procedure, as we have with the yellowlist.

I decided to uninstall PB and install it again to see what happens. After the reinstall it doesn't block notebooks.gesis.org anymore, which is very good. But we would like to ask whether there is a general fix for this issue, so that it is fixed for all users without them having to do a fresh install of PB.

bitnik avatar Feb 26 '20 09:02 bitnik

Hello!

Could you provide steps for Privacy Badger learning to block notebooks.gesis.org? I am not sure how this problem happens right now.

I do see that two of the three domains your Privacy Badger saw "tracking" from notebooks.gesis.org on are local addresses. We should probably disable learning when on localhost pages (#817).
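To illustrate the point about local addresses, here is a minimal sketch (not Privacy Badger's actual code) that filters local origins out of a SNITCH_MAP entry like the one posted above. The helper name `is_local_origin` is hypothetical, and treating `0.0.0.0` as local is an assumption based on the debug output in this issue.

```python
import ipaddress


def is_local_origin(host: str) -> bool:
    """Return True for hosts that point at the local machine (hypothetical helper)."""
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so treat it as a regular domain name.
        return False
    # 127.0.0.0/8 and ::1 are loopback; 0.0.0.0 and :: are "unspecified".
    return addr.is_loopback or addr.is_unspecified


# SNITCH_MAP entry copied from the debug output in this issue.
snitch_map = {"gesis.org": ["127.0.0.1", "0.0.0.0", "github.com"]}

# Keep only the first-party sites that are real, non-local websites.
real_sites = [h for h in snitch_map["gesis.org"] if not is_local_origin(h)]
print(real_sites)  # ['github.com']
```

With the local entries ignored, only one real site remains in the list for gesis.org.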

This seems similar to #2480 (another federated service).

ghostwords avatar Feb 26 '20 19:02 ghostwords

Hey, thanks a lot!

Could you provide steps for Privacy Badger learning to block notebooks.gesis.org? I am not sure how this problem happens right now.

Before the fresh install of PB, when I visited https://staging.mybinder.org/v2/gh/ogrisel/notebooks/master, PB was blocking notebooks.gesis.org and therefore mybinder.org couldn't redirect to notebooks.gesis.org. After the fresh install, notebooks.gesis.org is just green in PB and the redirection happens. (Btw, to reproduce this you should probably first clear the cookies for staging.mybinder.org in your browser, because mybinder.org saves the host that the user is going to be redirected to in a cookie.)

bitnik avatar Feb 27 '20 10:02 bitnik

I just realised an important detail: now I see that notebooks.gesis.org is under title "Your Badger hasn't yet learned to block these domains"

bitnik avatar Feb 27 '20 16:02 bitnik

The debug output for gesis.org looks like:

**** ACTION_MAP for gesis.org
gesis.org {
  "userAction": "",
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 0
}
notebooks.gesis.org {
  "userAction": "user_block",
  "dnt": false,
  "heuristicAction": "block",
  "nextUpdateTime": 1582572413865
}
**** SNITCH_MAP for gesis.org
gesis.org [
  "binderhub.readthedocs.io",
  "mybinder.org",
  "paddy10tellys.github.io"
]

Right now I don't know how it learnt to block notebooks.gesis.org. Will try to reconstruct this.

Some context on the setup of our federation: a user navigates to mybinder.org, and JS loaded from that domain then creates an EventStream to one of a set of sites (gke.mybinder.org, gesis.mybinder.org, ovh.mybinder.org, etc). Most of the sites work. The one that fails is the request to gesis.mybinder.org, which (unlike the others) gets redirected to notebooks.gesis.org (via a 30x). This means the request starts out as a request to a subdomain of mybinder.org but ends up as a cross-domain request. I am wondering if this is what makes PB block the request.
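The subdomain versus cross-domain distinction can be sketched as follows. This is only an illustration: `base_domain` is a hypothetical helper that naively takes the last two labels of a hostname, whereas a real implementation would consult the Public Suffix List, and it is not how Privacy Badger itself is written.

```python
def base_domain(host: str) -> str:
    """Naive base domain: the last two labels (assumption; ignores the Public Suffix List)."""
    labels = host.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_host: str, request_host: str) -> bool:
    """A request is third-party when its base domain differs from the page's."""
    return base_domain(page_host) != base_domain(request_host)


page = "mybinder.org"
targets = ["gke.mybinder.org", "gesis.mybinder.org", "notebooks.gesis.org"]

# Hosts under mybinder.org stay first-party; the redirect target does not.
results = {t: is_third_party(page, t) for t in targets}
for host, third_party in results.items():
    print(host, "third-party" if third_party else "first-party")
```

Under this simplification, the request to gesis.mybinder.org is first-party, but after the 30x it targets gesis.org, which is a different base domain from mybinder.org.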

Is this a case for MDFP? If yes I can make a PR to add mybinder.org and notebooks.gesis.org as part of the same entity. Alternatively is there any way (explicitly not sending cookies or some such, special flag during the redirect, some form of CORS like header) to make the request not trigger the tracking detection?

betatim avatar Mar 10 '20 07:03 betatim

Alternatively is there any way (explicitly not sending cookies or some such, special flag during the redirect, some form of CORS like header) to make the request not trigger the tracking detection?

Yes! If notebooks.gesis.org does not actually track users, it may be compliant with the EFF Do Not Track policy. You could then post the DNT policy on the notebooks.gesis.org domain. This will tell Privacy Badgers to always allow resources from notebooks.gesis.org.
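As a rough sketch of what posting the policy involves: the EFF DNT policy file is served from the `/.well-known/dnt-policy.txt` path on the domain, and a client can compare the served text against a list of recognized policy versions. The function names below are hypothetical, and the use of a SHA-1 digest for the comparison is an assumption for illustration; check the Privacy Badger source for the exact verification it performs.

```python
import hashlib
from typing import Iterable


def dnt_policy_url(domain: str) -> str:
    """URL where the EFF DNT policy file is expected to be served."""
    return f"https://{domain}/.well-known/dnt-policy.txt"


def policy_is_recognized(policy_text: bytes, known_hashes: Iterable[str]) -> bool:
    """Compare a digest of the served text with known policy digests.

    Assumption: SHA-1 hex digests; the real hash list is maintained by EFF.
    """
    return hashlib.sha1(policy_text).hexdigest() in set(known_hashes)


# Placeholder text, NOT the real EFF policy; used only to exercise the check.
sample = b"placeholder policy text\n"
known = {hashlib.sha1(sample).hexdigest()}

print(dnt_policy_url("notebooks.gesis.org"))
print(policy_is_recognized(sample, known))         # True
print(policy_is_recognized(sample + b" ", known))  # False: any edit changes the digest
```

The last line shows why the policy file should be posted verbatim: under a digest comparison, even a whitespace change would stop it from being recognized.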

ghostwords avatar Mar 10 '20 16:03 ghostwords