
Does the lack of true federation create a trust/maintenance blocker for FLoC?

AramZS opened this issue · 1 comment

The test posted by the Google team suggests that FLoC is not usable without either a SortingLSH sorting server or a SimHash anonymity server. Both create centralization (described as loose in both cases, though I am not familiar enough with the underlying process and math to judge just how loose), which was not what I was hoping for when I read the initial proposal for FLoC.
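For readers unfamiliar with why a sorting server implies centralization: SortingLSH (as I understand it from the paper) works by collecting every user's locality-sensitive hash in one place, sorting them, and cutting the sorted list into contiguous groups so that no cohort falls below a minimum size. A minimal sketch of that idea, greatly simplified (the function name, the greedy cut strategy, and the example hash values are all my own illustration, not the actual algorithm from the paper):

```python
def sorting_lsh_cohorts(user_hashes, k=3):
    """Group users into cohorts of at least k members each.

    Sort users by their LSH/SimHash value, then cut the sorted
    list into contiguous runs of at least k users: similar hashes
    end up together, and every cohort is k-anonymous.
    """
    ordered = sorted(user_hashes.items(), key=lambda item: item[1])
    cohorts, current = [], []
    for user, _ in ordered:
        current.append(user)
        if len(current) >= k:
            cohorts.append(current)
            current = []
    if current:  # fold any leftover users into the last cohort
        if cohorts:
            cohorts[-1].extend(current)
        else:
            cohorts.append(current)
    return cohorts

# Hypothetical hash values: three users cluster near 5-7, four near 200-250.
cohorts = sorting_lsh_cohorts(
    {"u1": 5, "u2": 6, "u3": 7, "u4": 200, "u5": 201, "u6": 202, "u7": 250},
    k=3,
)
```

The key point is the first line of the function body: sorting requires seeing *all* users' hashes, which is exactly the centralized step that a purely client-side scheme cannot perform.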

This looks to me like a potentially major source of contention, or of opportunity, depending on how it is approached.

I think the first question is: is true federation for this proposal impossible? The initial proposal was, I understand, highly theoretical and perhaps what we are discovering is that, in practice, there is no way to provide the utility and provide a reasonable guarantee of privacy without some centralized server in the mix. The section on affinity hierarchical clustering seems to indicate that there may be some future federation option that could eliminate centralization, but it does not specify.

Do the proposers currently believe that it is likely that this product would go live with a requirement for a central server? Or is that requirement considered a blocker?

If a central server is not considered a blocker, this presents a big question:

Who provides this central server?

Assuming that the techniques described would ensure that a central server receives no data which de-anonymizes users (a fair assumption based on the description), there are still trust problems:

  1. That the entity running the service is trusted to not attempt to leverage the data to shape the market in any way.
  2. That we trust that the entity will continue to exist in a form that allows it to maintain these servers.
  3. That the server's costs are supported.

From these issues I think we have some smaller questions:

Does providing the server potentially give the entity running it an opportunity to participate in the ad process (ex: accounting for ad calls to take a percentage of revenue per-ad)?

Does the user have the capacity to select servers?

Do the sites running ads have the capacity to select servers?

Could service providers hosting the servers for processing FLoC cohorts differentiate themselves based on the question of utility vs anonymity within particular limits? (Ex: can a server advertise itself as more private by requiring larger cohorts and allow users or user agents to select it on that basis?)

Thanks!

AramZS avatar Nov 05 '20 21:11 AramZS

Hi Aram,

I guess your statement "in practice, there is no way to provide the utility and provide a reasonable guarantee of privacy without some centralized server in the mix" depends on what you mean by "reasonable guarantee". One conclusion that I drew from that paper is that without any kind of server, we could base FLoC on SimHash, and expect to offer good privacy to most people (the paper used 98% for "most"), for example.
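To make the serverless option concrete: SimHash can be computed entirely on the client, with no coordination at all. A minimal sketch of the general SimHash technique (the feature strings, bit width, and hash choice here are my own illustration, not the specifics of the FLoC implementation):

```python
import hashlib

def simhash(features, bits=8):
    """Compute a small SimHash over a set of feature strings.

    Each feature casts a +1/-1 vote on every output bit based on
    its own hash; the sign of the per-bit total becomes that bit.
    Similar feature sets produce similar hashes, so users with
    overlapping browsing histories land in nearby cohort IDs.
    """
    vote = [0] * bits
    for f in features:
        h = int(hashlib.sha256(f.encode()).hexdigest(), 16)
        for i in range(bits):
            vote[i] += 1 if (h >> i) & 1 else -1
    return sum(1 << i for i in range(bits) if vote[i] > 0)

# Two hypothetical users with heavily overlapping histories;
# everything runs locally, with no server in the loop.
cohort_a = simhash(["news.example", "sports.example", "weather.example"])
cohort_b = simhash(["news.example", "sports.example", "recipes.example"])
```

Because the cohort ID is a pure function of the user's own data, no server ever needs to see anything; the trade-off is that nothing guarantees a minimum cohort size, which is where the "98% of people" figure comes in.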

But incorporating a server allows better privacy protections, and better privacy-utility trade-offs. Since we expect the Privacy Sandbox effort to include server-side infrastructure for privacy-safe aggregated measurement, it seems consistent to reuse that infrastructure to make better flocks as well.

Certainly the questions relating to "who runs the servers" will need answers. But I'd say those answers are more urgent for a wide range of non-FLoC use cases. For our purposes, we can assume this infrastructure is available at negligible cost.

michaelkleber avatar Nov 13 '20 21:11 michaelkleber