attribution-reporting-api
False-positive event reporting conflicts with fraud prevention
This issue is to continue the discussion on #84, specifically on fraud prevention and false positives.
The proposed flow, as I understand it:
- User clicks from `source` to `attributeon`, and the browser asks `source` for a token. The browser blinds the token and stores it. Let's call it `source_token`.
- User performs an action on `attributeon` for which they wish to trigger attribution. They call the provided API and provide a token that the browser also blinds and stores. Let's call it `trigger_token`.
- The browser performs the noise, delay, and attribution process. When it's time to send the reports, it includes both `source_token` and `trigger_token`, which act as authentication (since reports are sent to an open `.well-known` path).
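To make the three steps concrete, here is a minimal sketch of the flow. It is not the proposed implementation: an HMAC over a browser-chosen nonce stands in for blind token issuance (no real blinding is performed), and `Browser`, `Report`, `issue_token`, `verify_token`, and both keys are hypothetical names invented for illustration.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass
from typing import Optional

# Hypothetical issuer secrets. A real design would use blind signatures so the
# issuer cannot link an issued token to the report that later carries it.
SOURCE_KEY = secrets.token_bytes(32)
ATTRIBUTEON_KEY = secrets.token_bytes(32)


def issue_token(key: bytes, nonce: bytes) -> bytes:
    """Stand-in for token issuance: a MAC over a browser-chosen nonce."""
    return hmac.new(key, nonce, hashlib.sha256).digest()


def verify_token(key: bytes, nonce: bytes, token: bytes) -> bool:
    return hmac.compare_digest(issue_token(key, nonce), token)


@dataclass
class Report:
    impression_id: str
    source_nonce: bytes
    source_token: bytes
    trigger_nonce: Optional[bytes] = None  # None until a trigger is registered
    trigger_token: Optional[bytes] = None


class Browser:
    def __init__(self) -> None:
        self._reports = {}  # impression_id -> Report

    def on_click(self, impression_id: str) -> None:
        # Step 1: ask source for a token at click time and store it.
        nonce = secrets.token_bytes(16)
        self._reports[impression_id] = Report(
            impression_id, nonce, issue_token(SOURCE_KEY, nonce)
        )

    def on_trigger(self, impression_id: str) -> None:
        # Step 2: attributeon supplies a token when it triggers attribution.
        report = self._reports[impression_id]
        report.trigger_nonce = secrets.token_bytes(16)
        report.trigger_token = issue_token(ATTRIBUTEON_KEY, report.trigger_nonce)

    def build_report(self, impression_id: str) -> Report:
        # Step 3: after noise and delay (omitted here), send both tokens.
        return self._reports.pop(impression_id)


browser = Browser()
browser.on_click("imp-1")
browser.on_trigger("imp-1")
report = browser.build_report("imp-1")
```

The receiver of `report` can then check both tokens with `verify_token` before accepting it.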
With this flow, every report provides a single bit of un-noised information from the attributeon to the source: that some trigger event happened on the attributeon domain. In order to prevent this, the noise mechanism should also return some reports with clicks that didn’t result in a trigger event on the attributeon domain.
However, for such a report only step 1 has occurred, and the browser has no way to generate the trigger_token. If the report is sent without the token (or with an invalid one), then it’s trivial to identify the “noised events”.
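The conflict can be shown with a small simulation. This is only a sketch under stated assumptions: an HMAC stands in for the blinded token, the receiver is assumed to be able to validate `trigger_token`, and `make_report`, `looks_real`, `fraction_of_noise_identified`, and the 5% noise rate are all hypothetical.

```python
import hashlib
import hmac
import random
import secrets

ATTRIBUTEON_KEY = secrets.token_bytes(32)  # hypothetical issuer secret


def make_report(has_real_trigger: bool) -> dict:
    nonce = secrets.token_bytes(16)
    if has_real_trigger:
        token = hmac.new(ATTRIBUTEON_KEY, nonce, hashlib.sha256).digest()
    else:
        # No trigger happened, so the browser was never given a token.
        # The best it can do is omit the token or make one up.
        token = secrets.token_bytes(32)
    return {"nonce": nonce, "trigger_token": token, "is_noise": not has_real_trigger}


def looks_real(report: dict) -> bool:
    """What a receiver that can validate trigger tokens would compute."""
    expected = hmac.new(ATTRIBUTEON_KEY, report["nonce"], hashlib.sha256).digest()
    return hmac.compare_digest(expected, report["trigger_token"])


def fraction_of_noise_identified(
    n_reports: int = 1000, noise_rate: float = 0.05, seed: int = 0
) -> float:
    rng = random.Random(seed)
    reports = [make_report(rng.random() >= noise_rate) for _ in range(n_reports)]
    noise = [r for r in reports if r["is_noise"]]
    caught = [r for r in noise if not looks_real(r)]
    return len(caught) / len(noise) if noise else 1.0
```

Because a made-up token fails validation (except with negligible probability), token validity separates genuine reports from noised ones, so the noise hides nothing from whoever can check the token.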
One potential solution discussed was a domain-level token; however, that appears to have a few potential issues:
- If these are issued to all visitors to the `attributeon` domain, then it would be relatively easy for a malicious actor to accumulate tokens by visiting the site and then use them to report fraudulent conversions.
- Relying on any token issued by the `attributeon` domain reveals that the session with the associated `impression_id` had a visit to the `attributeon` domain. For clicks, this isn’t new information, but for a view impression it would be.
- Drawing the false positives from only sessions which visited the `attributeon` domain adds some bias to the false positives, which may weaken the privacy.
Another solution is to pass all these reports through a trusted server (i.e. the aggregate reporting MPC) which validates the tokens, replaces them with its own token, and passes the event to the .well-known URL. The trust model then becomes “source and attributeon trust the MPC to honestly only pass on reports with valid tokens + the appropriate number of noised reports”. That said, this approach adds a ton of complexity.
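A rough sketch of what such a relay might do, to show where the complexity comes from. Everything here is an assumption for illustration: HMACs stand in for the real tokens (and are bound to the impression ID, which a blinded scheme would avoid), the relay itself decides which trigger-less reports to forward as noise, and `relay`, `mac`, `noise_rate`, and the three keys are hypothetical names.

```python
import hashlib
import hmac
import random
import secrets
from typing import Optional

# Hypothetical secrets held by each party.
SOURCE_KEY = secrets.token_bytes(32)
ATTRIBUTEON_KEY = secrets.token_bytes(32)
RELAY_KEY = secrets.token_bytes(32)


def mac(key: bytes, message: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()


def relay(report: dict, noise_rate: float, rng: random.Random) -> Optional[dict]:
    """Validate an incoming report and re-issue it under the relay's own token.

    Returns the report to forward to the .well-known URL, or None if dropped.
    """
    impression_id = report["impression_id"].encode()

    # Every report, genuine or noise, must stem from a real source event.
    if not hmac.compare_digest(mac(SOURCE_KEY, impression_id), report["source_token"]):
        return None

    trigger_token = report.get("trigger_token")
    if trigger_token is None:
        # Candidate noise report: the relay, not the sender, controls how
        # many of these get through.
        if rng.random() >= noise_rate:
            return None
    elif not hmac.compare_digest(mac(ATTRIBUTEON_KEY, impression_id), trigger_token):
        return None  # forged or invalid trigger token

    # Genuine and noise reports leave the relay in an identical format.
    return {
        "impression_id": report["impression_id"],
        "relay_token": mac(RELAY_KEY, impression_id),
    }
```

The receiver only ever checks `relay_token`, so it cannot tell which forwarded reports had a valid `trigger_token`; in exchange, both parties have to trust the relay's validation and its noise rate.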
Thanks for summarizing this issue, Erik. At a high level this matches my understanding of the problem.
FYI @davidvancleve, who is starting to look at some conversion authentication issues.