Public Provision

Open darobin opened this issue 3 months ago • 1 comments

We are thankful that you are producing this document, which we believe has genuine potential to shape better outcomes for AI and the web.

One aspect which you don’t address is the need for public provision of AI. Over the past decades we have witnessed the consequences of having many key infrastructural components of the web and of our digital lives being entirely provided by private actors; we believe that it is essential to establish the public provision of an AI commons today, before privatized infrastructure captures the market and becomes entrenched.

With this in mind, we would like to suggest the addition of a section to your document to cover this concern, maybe like the one below (happy to make a PR if that’s helpful). (Note: the fenced section is meant to match the .advisement sections in the doc.)

@thelastjosh & @darobin (Public AI Network)

Public Provision

A recurring issue on the web is that several of its critical infrastructure components, such as search and social, are entirely provided by a highly concentrated set of private actors who operate without accountability to their users. This brings the web out of alignment with its users’ needs and makes it challenging for the web to deliver on the ethical, human-centric elements of its mission.

As AI establishes itself as an important part of the web, we need to ensure that we do not repeat the mistakes of previous eras. The web is a commons and we need the web and web standards to support a thriving, commons-based production of AI systems. This commons-based production includes local and public provision of AI components (models, data, etc.) from such actors as territorial states, cities, or any open project with democratic governance. For that to happen, we need to ensure that such actors can support or create, at a sustainable cost, AI applications and services and that these can remain competitive over time.

The signs of a healthy commons to establish for AI on the web include:

  • a thriving ecosystem of AI models, not just a dominant one or two or three that serve as oracles for all of humankind,
  • equal access to web-scale data,
  • the absence of extractive methods centered on using personal data to train models, and
  • people having a meaningful say on how any AI that they rely on or that impacts their lives works.

Web standards that reinforce or reinstate the web's status as a public commons could (and, arguably, should) lead to a version of the web where AI itself becomes a kind of public commons, rather than a privately operated service like search engines or social networks. This may require resolving the tension between making the web a better place for people and making the web a better place for computers (including AI).

One plausible role for the W3C is to establish metadata standards for AI. These could cover aspects such as whether the governance of a model is public, whether the data that went into training it is public open data or has a specific national origin, or what principles guided the collection of training data, notably from a privacy standpoint.
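
To make the metadata idea more concrete, here is a minimal sketch of what such a record might carry. It is purely illustrative: the class name `ModelMetadata`, every field name, and the example values are assumptions made up for this sketch, not an existing W3C vocabulary or anything proposed in detail here.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List, Optional


@dataclass
class ModelMetadata:
    """Hypothetical metadata record for an AI model (all fields are assumptions)."""
    name: str
    # Whether the governance of the model is public.
    public_governance: bool
    # Whether the training data is public open data.
    open_training_data: bool
    # Optional national origin of the training data (assumed: a country code).
    training_data_origin: Optional[str] = None
    # Principles that guided the collection of training data, e.g. on privacy.
    data_collection_principles: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the record to JSON so it could be published next to a model."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Illustrative values only; they do not describe any real model.
example = ModelMetadata(
    name="example-public-model",
    public_governance=True,
    open_training_data=True,
    training_data_origin="CH",
    data_collection_principles=["no personal data", "opt-in contributions only"],
)
print(example.to_json())
```

An actual standard would more likely define a shared vocabulary than a Python class; the sketch only shows the kind of questions the metadata would need to answer.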

Several public AI initiatives are focused on issues of language in LLMs. The W3C’s Internationalization 
activity could provide its expertise in support of such work, and could potentially help source content 
from a greater and better-labeled array of languages.

The use of personal data servers (PDS) could be developed, via web standards, in order to support 
people in training their own models using their personal data without having to provide it to a commercial 
actor. Public models could be produced specifically for this kind of usage, to help protect AI usage from 
data-centric business models.

darobin, Apr 03 '24 15:04