attribution-reporting-api

False-positive event reporting conflicts with fraud prevention

Open eriktaubeneck opened this issue 4 years ago • 1 comments
This issue is to continue the discussion on #84, specifically on fraud prevention and false positives.

The proposed flow, as I understand it:

  1. User clicks from source to attributeon, and browser asks source for a token. Browser blinds the token and stores it. Let's call it source_token.
  2. User performs an action on attributeon which they wish to trigger attribution. They call the provided API and provide a token, which the browser also blinds and stores. Let's call it trigger_token.
  3. The browser performs the noise, delay, and attribution process. When it's time to send the reports, it includes both source_token and trigger_token, which act as authentication (since reports are sent to an open .well-known path).

With this flow, every report provides a single bit of un-noised information from the attributeon to the source: that some trigger event happened on the attributeon domain. In order to prevent this, the noise mechanism should also return some reports with clicks that didn’t result in a trigger event on the attributeon domain.

However, for such a noised report only step 1 has occurred, so the browser has no way to generate the trigger_token. If the report is sent without the token (or with an invalid one), then it’s trivial to identify the “noised events”.
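
To make the problem concrete, here is a minimal Python sketch of why a noised report stands out. Everything in it is hypothetical: the HMAC keys are only a stand-in for whatever blinded-token scheme ends up being used (an HMAC is not blind and needs a shared key to verify), and the Report fields, trigger_id, and helper names are invented for illustration.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Optional

# Hypothetical issuer keys. A real design would use blind signatures;
# HMAC here only models "a token that only the issuer can mint".
SOURCE_KEY = secrets.token_bytes(32)
TRIGGER_KEY = secrets.token_bytes(32)


def issue_token(key: bytes, event_id: str) -> bytes:
    """Mint a token bound to an event id (assumed binding, for illustration)."""
    return hmac.new(key, event_id.encode(), hashlib.sha256).digest()


def token_is_valid(key: bytes, event_id: str, token: Optional[bytes]) -> bool:
    """Check a token against the issuer key; a missing token is invalid."""
    if token is None:
        return False
    return hmac.compare_digest(issue_token(key, event_id), token)


@dataclass
class Report:
    impression_id: str
    trigger_id: str
    source_token: bytes
    trigger_token: Optional[bytes]


def real_report(impression_id: str, trigger_id: str) -> Report:
    # Steps 1 and 2 both happened, so the browser holds both tokens.
    return Report(
        impression_id,
        trigger_id,
        issue_token(SOURCE_KEY, impression_id),
        issue_token(TRIGGER_KEY, trigger_id),
    )


def noise_report(impression_id: str) -> Report:
    # Only step 1 happened. The browser cannot mint a trigger_token,
    # so the best it can do is send random bytes (or nothing).
    return Report(
        impression_id,
        "made-up-trigger",
        issue_token(SOURCE_KEY, impression_id),
        secrets.token_bytes(32),
    )


def looks_like_noise(report: Report) -> bool:
    # Anyone able to verify trigger tokens can separate real from noised reports.
    return not token_is_valid(TRIGGER_KEY, report.trigger_id, report.trigger_token)
```

In this sketch, looks_like_noise returns False for every real report and True for every noised one, so the false positives add no deniability against a party that can verify the trigger token.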

One potential solution discussed was a domain-level token; however, that appears to have a few potential issues:

  1. If these are issued to all visitors to the attributeon domain, then it would be relatively easy for a malicious actor to accumulate tokens by visiting the site, and using those to report fraudulent conversions.
  2. Relying on any token issued by the attributeon domain reveals that the session with the associated impression_id included a visit to the attributeon domain. For clicks, this isn’t new information, but for a view impression it would be.
  3. Drawing the false-positives only from sessions which visited the attributeon domain adds some bias to the false-positives, which may weaken the privacy.

Another solution is to pass all these reports through a trusted server (i.e. the aggregate reporting MPC) which validates the tokens, replaces them with its own token, and passes the event to the .well-known URL. The trust model then becomes “source and attributeon trust the MPC to honestly only pass on reports with valid tokens + the appropriate number of noised reports”. That said, this approach adds a ton of complexity.
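
A rough Python sketch of that trusted-server idea follows, under heavy assumptions. The relay function, the key names, the noise_rate parameter, and the idea that the server itself draws noise from a list of untriggered impressions are all hypothetical, and HMAC again stands in for the real token scheme. This is one way such a server could work, not a proposal for how the MPC works.

```python
import hashlib
import hmac
import random
import secrets
from dataclasses import dataclass
from typing import List

# Hypothetical keys: one held by the attributeon token issuer (shared with
# the trusted server for validation), one private to the server itself.
TRIGGER_KEY = secrets.token_bytes(32)
RELAY_KEY = secrets.token_bytes(32)


def mac(key: bytes, message: str) -> bytes:
    return hmac.new(key, message.encode(), hashlib.sha256).digest()


@dataclass
class BrowserReport:
    impression_id: str
    trigger_token: bytes


@dataclass
class ForwardedReport:
    impression_id: str
    relay_token: bytes  # replaces the original tokens


def relay(
    reports: List[BrowserReport],
    untriggered_impressions: List[str],
    noise_rate: float,
    rng: random.Random,
) -> List[ForwardedReport]:
    forwarded = []
    # Pass on only reports whose trigger_token validates.
    for report in reports:
        expected = mac(TRIGGER_KEY, report.impression_id)
        if hmac.compare_digest(expected, report.trigger_token):
            forwarded.append(
                ForwardedReport(report.impression_id, mac(RELAY_KEY, report.impression_id))
            )
    # Add false positives; they carry the same kind of token as real reports.
    for impression_id in untriggered_impressions:
        if rng.random() < noise_rate:
            forwarded.append(ForwardedReport(impression_id, mac(RELAY_KEY, impression_id)))
    # Shuffle so ordering does not reveal which reports are noise.
    rng.shuffle(forwarded)
    return forwarded
```

Because every forwarded report carries only the server's own token, the receiver at the .well-known URL cannot tell real reports from noised ones by inspecting tokens. The cost is exactly the trust assumption described above, plus the extra infrastructure.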

eriktaubeneck avatar Feb 22 '21 01:02 eriktaubeneck

Thanks for summarizing this issue Erik. At a high level this matches my understanding of the problem.

FYI @davidvancleve, who is starting to look at some conversion authentication issues.

csharrison avatar Feb 22 '21 03:02 csharrison