Write up documentation on how document URIs, document equivalence and document metadata work

Open robertknight opened this issue 4 years ago • 3 comments

A perennial cause of confusion in Hypothesis is how document metadata is managed and used. This includes topics such as:

  • What is URI normalization, how does it work and what are the consequences?
  • Which document URIs are captured by the client and stored in h?
  • Which document metadata is captured by the client and stored in h?
  • What affects which annotations are fetched when annotating a particular document?
  • What changes to a document will "break" the link between annotations and their document?
  • How can incorrect document metadata be corrected?
  • How can document metadata be updated if URLs are moved to a new location?

Audiences for this documentation, or parts of it, include:

  • Hypothesis developers, especially new ones trying to learn about the system
  • Customer support staff who need to understand at least parts of this to assist users
  • End users who are trying to understand how Hypothesis associates annotations with URIs

Since these audiences will be interested in understanding this at different technical levels, it might make sense to write up documentation in the docs/ dir in the h repo which contains a technically detailed version maintained by H engineers. The support team can then distill that down to information that is easier for end-users to understand as necessary.

robertknight avatar Apr 06 '21 15:04 robertknight

I would add:

  • <link rel="canonical" href="...." /> vs <link rel="alternate" href="...">
  • Some documents are made equivalent only after a new annotation is added, while for other documents this does not seem to be necessary.
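
For anyone writing this up, a quick way to see which of these links a page declares is sketched below. It only collects the `href` values of `<link rel="canonical">` and `<link rel="alternate">` elements using Python's standard `html.parser`; the names `LinkRelCollector` and `collect_links` are made up for illustration, and the sketch makes no claim about how the client or h actually treat these links, which is exactly what the documentation needs to explain.

```python
from html.parser import HTMLParser


class LinkRelCollector(HTMLParser):
    """Collect href values of <link rel="canonical"> and <link rel="alternate">."""

    def __init__(self):
        super().__init__()
        self.links = {"canonical": [], "alternate": []}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags (<link ... />) are also routed here by HTMLParser.
        if tag != "link":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # rel may hold several space-separated tokens, e.g. rel="alternate stylesheet".
        for rel in (attr.get("rel") or "").lower().split():
            if rel in self.links and href:
                self.links[rel].append(href)


def collect_links(html: str) -> dict:
    """Return the canonical and alternate link targets declared in an HTML string."""
    collector = LinkRelCollector()
    collector.feed(html)
    return collector.links
```

Feeding it the `<head>` of a page gives a dictionary with one list of URLs per `rel` value, which can then be compared with the URIs stored for that document.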

esanzgar avatar Apr 06 '21 15:04 esanzgar

Are webpages with the same title, and the same hostname (e.g. lesswrong.com) and protocol (e.g. https) considered equivalent??

EDIT: third guess: probably not?

EDIT: I have read google / site:web.hypothes.is equivalence

  • https://web.hypothes.is/help/how-to-establish-or-avoid-document-equivalence-in-the-hypothesis-system/
  • https://web.hypothes.is/help/how-hypothesis-interacts-with-document-metadata/

martin12333 avatar Jun 06 '21 17:06 martin12333

Are webpages with the same title, and the same hostname (e.g. lesswrong.com) and protocol (e.g. https) considered equivalent??

The title is not considered. The URL (hostname, protocol, pathname and query params, but not the fragment) is taken into account, with some exceptions:

  • http and https are considered the same
  • Some common query params are ignored (those added to links by Google Analytics or FB)
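
As a rough illustration of the rules above, here is a minimal Python sketch. It is not the normalization code that h uses; `normalize_uri`, `same_document` and the `IGNORED_PARAMS` list are assumptions made up for this example, and the real list of ignored query params may differ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumed examples of tracking params added by Google Analytics and Facebook.
# The actual list used by h may be different.
IGNORED_PARAMS = {
    "utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content",
    "fbclid",
}


def normalize_uri(uri: str) -> str:
    """Return a comparison key for a URI under the rules described above."""
    parts = urlsplit(uri)
    # http and https are treated as the same scheme.
    scheme = "http" if parts.scheme in ("http", "https") else parts.scheme
    # Drop the ignored tracking params, keep every other query param.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # The fragment is discarded by passing an empty string.
    return urlunsplit((scheme, parts.netloc.lower(), parts.path, urlencode(kept), ""))


def same_document(a: str, b: str) -> bool:
    """Compare two URIs by their normalized form; the page title plays no part."""
    return normalize_uri(a) == normalize_uri(b)
```

Under these assumptions, `http://example.com/a` and `https://example.com/a#intro` compare as the same document, while two pages on the same host with different paths do not, whatever their titles are.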

robertknight avatar Jun 06 '21 20:06 robertknight