bibxml-service

Canonical citation identifiers/URLs

Open strogonoff opened this issue 3 years ago • 11 comments

Originally, we used source dataset references as canonical identifiers. Those are effectively filenames.

This was suboptimal for several reasons: those filenames aren’t themselves canonical; they aren’t guaranteed to be universally unique (only unique within a dataset); a single citation can be composed of data from multiple datasets; and, more generally, these references are implementation details/artifacts of citation sourcing logic and can change (external tools that collect citation data aren’t concerned with consistency of URLs, nor should they be).

Now, we’re switching to document identifiers.

However, the problem is that one citation/bibliographic item can have multiple identifiers: one of type DOI, one of type ISBN, and even several identifiers of the same type formatted slightly differently (e.g., IETF: https://github.com/ietf-ribose/relaton-data-rfcsubseries/issues/4).

  • One implication is that, if we depend on document identifiers, we can have multiple (sometimes nearly identical) URLs leading to the same citation (/types/IETF/RFC+3972/, /types/IETF/RFC3972/ and so on). At the very least, this makes the BibXML service a somewhat badly behaved Web citizen.
  • We could treat the first identifier as canonical and always redirect to it, but it’s unclear whether the Relaton model gives any significance to the order in which identifiers appear (if it’s rooted in XML, it probably doesn’t, because ordering doesn’t matter there). And if the order changes due to external tool implementation details, canonical URLs will break.
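
To illustrate the second point, here is a minimal sketch of how a "first identifier wins" rule behaves when the identifier order changes. The `citation_url` and `canonical_url` helpers and the `(type, id)` tuple shape are hypothetical and not taken from the bibxml-service code; only the `/types/<type>/<id>/` URL pattern comes from the examples above.

```python
from typing import Sequence, Tuple
from urllib.parse import quote_plus

DocID = Tuple[str, str]  # (type, id); an assumed shape for illustration


def citation_url(doc_type: str, doc_id: str) -> str:
    """Build a /types/<type>/<id>/ path (hypothetical helper)."""
    return f"/types/{quote_plus(doc_type)}/{quote_plus(doc_id)}/"


def canonical_url(docids: Sequence[DocID]) -> str:
    """Treat whichever identifier comes first as canonical."""
    doc_type, doc_id = docids[0]
    return citation_url(doc_type, doc_id)


# Same citation, two formatting variants of the same identifier.
docids = [("IETF", "RFC 3972"), ("IETF", "RFC3972")]

before = canonical_url(docids)
# An external tool emits the same identifiers in a different order.
after = canonical_url(list(reversed(docids)))

print(before)  # /types/IETF/RFC+3972/
print(after)   # /types/IETF/RFC3972/
```

The citation data is unchanged, yet the "canonical" URL differs between the two runs, which is exactly the breakage described above.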

One way forward would be to come up with our own identifier. It could be derived from citation data in some way (but that makes it liable to break if citation data changes), or it could be truly random, like a UUID, with indexing logic ensuring it stays the same. I’m looking into this option.
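
A rough sketch of the random-identifier option, assuming the indexer keeps a lookup from every known document identifier to the UUID assigned to its citation. `CitationRegistry` and its in-memory dict are hypothetical stand-ins for whatever persisted index would actually be used, and the sample identifiers are illustrative only.

```python
import uuid
from typing import Dict, Iterable, Tuple

DocID = Tuple[str, str]  # (type, id); an assumed shape for illustration


class CitationRegistry:
    """Hypothetical in-memory stand-in for a persisted index."""

    def __init__(self) -> None:
        self._by_docid: Dict[DocID, uuid.UUID] = {}

    def assign(self, docids: Iterable[DocID]) -> uuid.UUID:
        """Return a stable UUID for the citation carrying these identifiers."""
        docids = list(docids)
        # Reuse the UUID of any identifier seen before, regardless of order.
        # If identifiers point at different UUIDs, this sketch simply takes
        # the first match; a real indexer would need a merge policy.
        known = [self._by_docid[d] for d in docids if d in self._by_docid]
        citation_uuid = known[0] if known else uuid.uuid4()
        # Remember new identifiers without overwriting existing mappings.
        for d in docids:
            self._by_docid.setdefault(d, citation_uuid)
        return citation_uuid


registry = CitationRegistry()
first = registry.assign([("IETF", "RFC 3972"), ("DOI", "10.17487/RFC3972")])
# Reindexing: identifiers reordered, and a new formatting variant appears.
second = registry.assign([("DOI", "10.17487/RFC3972"), ("IETF", "RFC3972")])
third = registry.assign([("IETF", "RFC3972")])
```

In this sketch the UUID survives reordering and the addition of new identifier variants, as long as at least one previously seen identifier is still present. It does not survive a reindex in which every identifier of a citation changes at once, so the mapping itself has to be persisted rather than regenerated.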

strogonoff, Jan 06 '22 19:01