T - Investigate mitigation for GA bogus traffic with no clientID

Open tobinmori opened this issue 4 years ago • 4 comments

This task is investigation only: report back in the comments on how difficult this would be to fix.

Possible solutions:

  • inject client_id into links
  • check HTTP Referrer

tobinmori avatar Apr 01 '20 18:04 tobinmori

* check HTTP Referrer

This has already been implemented, so there's nothing left to say about how difficult it is.

So what's left? To attach a client ID from GA to those sign-in links, you wait, on the client side, for the GA object to be initialized and then append a ?clientId={value} to the links that power the buttons on the auth modal (and the auth landing page). From there it's pretty easy to pass it to the kuma.core.ga_tracking.track_event function, which already supports it as an option. The tricky part might be injecting ourselves before allauth.
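
To make the query-string part concrete, here is a minimal sketch of the URL handling in Python. In practice the appending would happen in client-side JavaScript once GA is initialized; this only shows the shape of the round trip. The helper names `add_client_id` and `get_client_id` and the example login path are hypothetical, not existing Kuma code, and the parameter name `clientId` is simply taken from the comment above.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

# Assumed parameter name, taken from the "?clientId={value}" idea above.
CLIENT_ID_PARAM = "clientId"


def add_client_id(url: str, client_id: str) -> str:
    """Return `url` with the client ID set, keeping any existing query parameters."""
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    query[CLIENT_ID_PARAM] = [client_id]  # overwrite rather than duplicate
    return urlunsplit(parts._replace(query=urlencode(query, doseq=True)))


def get_client_id(url: str) -> Optional[str]:
    """Read the client ID back out of a URL; None if it is absent."""
    values = parse_qs(urlsplit(url).query).get(CLIENT_ID_PARAM)
    return values[0] if values else None


# Hypothetical sign-in link, for illustration only.
link = add_client_id("/users/github/login/?next=/en-US/", "123.456")
```

On the server side, something like `get_client_id` would run before the redirect to the auth provider, and the value would then be handed to the tracking call. How that fits around allauth is the open question.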

So, instead of investigating, how about if I just go ahead and do it? And if it takes longer than 1 "full uninterrupted day", I simply quit and report back.

CC @atopal

peterbe avatar Apr 06 '20 19:04 peterbe

Sounds good to me, Peter. Maybe in the next sprint?

atopal avatar Apr 09 '20 12:04 atopal

Pardon my ignorance, but suppose we can get the clientID into some of the Python-based GA events. It'll only really be for the "auth started" one, right? What does that mean for other Python-based events that don't happen as a direct effect of clicking things on our website? For example, when the auth providers redirect back to us, or when we notice that a user "cross-benefitted" from having a matching email address on the other provider.

peterbe avatar Apr 13 '20 13:04 peterbe

@peterbe It's not really important that the event is an effect of clicking things on our website. Things can happen offsite, but ideally they are linked to a session with a clientID. For example, our task completion survey is on Survey Gizmo. When users submit the survey there, we pipe those events back into GA with the original clientID. None of those events had anything to do with the MDN site.
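
As a rough illustration of what "piping an event back into GA with the original clientID" can look like, here is a sketch that builds a Universal Analytics Measurement Protocol (v1) event payload. It is not the actual Survey Gizmo integration or Kuma's code: the function name `build_event_payload`, the category/action values, and the `UA-XXXXX-Y` tracking ID are placeholders. The point is only that `cid` carries the client ID, so the hit is tied to the same session no matter where it originates.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode

# Universal Analytics Measurement Protocol endpoint (hits are POSTed here).
GA_COLLECT_URL = "https://www.google-analytics.com/collect"


def build_event_payload(
    tracking_id: str,
    client_id: str,
    category: str,
    action: str,
    label: Optional[str] = None,
) -> str:
    """Build a form-encoded body for a Measurement Protocol v1 event hit."""
    params = {
        "v": "1",            # protocol version
        "tid": tracking_id,  # GA property ID
        "cid": client_id,    # the original client ID, so the hit joins that session
        "t": "event",        # hit type
        "ec": category,      # event category
        "ea": action,        # event action
    }
    if label is not None:
        params["el"] = label  # optional event label
    return urlencode(params)


# Placeholder values, for illustration only.
payload = build_event_payload("UA-XXXXX-Y", "123.456", "survey", "submitted")
```

The payload would be sent as the body of a POST to `GA_COLLECT_URL`; the sketch stops short of sending it so it has no network side effects.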

atopal avatar Apr 14 '20 13:04 atopal