
RFC6249: Metalink/HTTP: Mirrors and Hashes

Open lidel opened this issue 4 years ago • 2 comments

This is a more powerful alternative to Alt-Svc, discussed in https://github.com/ipfs/in-web-browsers/issues/144. There is also an IETF draft with Content-Digest and Want-Content-Digest headers (we track that in #185), but this one seems to be more flexible.

RFC6249 enables metalink hints to be returned as HTTP response headers:

1. Introduction

Metalink/HTTP is an alternative and complementary representation of Metalink information, which is usually presented as an XML-based document format RFC5854. Metalink/HTTP attempts to provide as much functionality as the Metalink/XML format by using existing standards, such as Web Linking RFC5988, Instance Digests in HTTP RFC3230, and Entity Tags (also known as ETags) RFC2616. Metalink/HTTP is used to list information about a file to be downloaded. This can include lists of multiple URIs (mirrors), Peer-to-Peer information, cryptographic hashes, and digital signatures.

1.1. Example Metalink Server Response

This example shows a brief Metalink server response with ETag, mirrors, Peer-to-Peer information, Metalink/XML, OpenPGP signature, and a cryptographic hash of the whole file:

   Etag: "thvDyvhfIqlvFe+A9MYgxAfm1q5="
   Link: <http://www2.example.com/example.ext>; rel=duplicate
   Link: <ftp://ftp.example.com/example.ext>; rel=duplicate
   Link: <http://example.com/example.ext.torrent>; rel=describedby;
   type="application/x-bittorrent"
   Link: <http://example.com/example.ext.meta4>; rel=describedby;
   type="application/metalink4+xml"
   Link: <http://example.com/example.ext.asc>; rel=describedby;
   type="application/pgp-signature"
   Digest: SHA-256=9HVXcpSXzGTuTNHu/JcJIggAJSzgRWF8GzWGCMe8hgo=
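
To make the response above concrete from a client's point of view, here is a minimal sketch of splitting Link header values into mirrors (rel=duplicate) and metadata documents (rel=describedby). The function names parse_link and classify are made up for illustration, folded header lines are assumed to be already joined, and the regex is a simplification that does not handle ";" or "," inside quoted parameter values.

```python
import re

# One link-value: <URI> followed by zero or more ";"-separated parameters.
LINK_RE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]*)*)')


def parse_link(value):
    """Parse one Link header value into a list of (uri, params) pairs."""
    links = []
    for uri, raw_params in LINK_RE.findall(value):
        params = {}
        for part in raw_params.split(";"):
            if "=" in part:
                key, _, val = part.partition("=")
                params[key.strip().lower()] = val.strip().strip('"')
        links.append((uri, params))
    return links


def classify(header_values):
    """Split Link header values into mirror URIs and (uri, type) descriptions."""
    mirrors, descriptions = [], []
    for value in header_values:
        for uri, params in parse_link(value):
            rels = params.get("rel", "").split()
            if "duplicate" in rels:
                mirrors.append(uri)
            elif "describedby" in rels:
                descriptions.append((uri, params.get("type")))
    return mirrors, descriptions
```

A client could try the mirrors in order and use the describedby entries to find the torrent, Metalink/XML, or signature file by media type.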

Ideas how to use this on HTTP Gateways

Having this as part of the HTTP spec makes it much easier for us to implement things we have always wanted but held back on, because we did not want to invent IPFS-specific proprietary semantics. Below is a short list of the most obvious ones; comments with additional ideas are welcome.

(A) Return hash in Digest field to use HTTP-native semantics to enable verifiable gateway response (#128)

If we are returning a small file that fits in a single IPFS block and was hashed with SHA (or another function supported by the web platform), we could return that hash as-is.

We could also return a raw Multihash or a CID of the entire DAG. Details would have to be determined around our plans to standardize Multihash before CID, etc. In broad brush strokes, it could look something like this (with either MH or CID):

    Digest: SHA-256=e7EpE2zVw5H2okAeXLcxdXXc95NSJJU2vqOpN675vZw=
    Digest: MH=QmWfVY9y3xjsixTgbd9AorQxH7VtMpzfx2HaWtsoUYecaX
    Digest: CID=bafybeid3weurg3gvyoi7nisadzolomlvoxoppe2sesktnpvdve3256n5tq     
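
For the plain SHA-256 case, client-side verification could be sketched roughly as below. This assumes a single ALG=value pair in the header (the real Digest header may carry a comma-separated list) and treats the MH and CID labels from this proposal as unsupported, since verifying them would need IPFS-specific logic. The function names are hypothetical.

```python
import base64
import hashlib


def parse_digest(header_value):
    """Split 'ALG=value' on the first '=' only, since base64 padding also uses '='."""
    algorithm, _, encoded = header_value.strip().partition("=")
    return algorithm.upper(), encoded


def verify_sha256_digest(header_value, body):
    """Return True only if the header is a SHA-256 digest that matches body."""
    algorithm, encoded = parse_digest(header_value)
    if algorithm != "SHA-256":
        # MH= and CID= (proposed above) are not handled in this sketch.
        return False
    try:
        expected = base64.b64decode(encoded, validate=True)
    except ValueError:
        # Malformed base64 is treated as a failed verification.
        return False
    return expected == hashlib.sha256(body).digest()
```

Splitting on the first "=" matters here because the base64 value itself usually ends in "=" padding.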

(B) URI hint that the content is available on IPFS

Opening https://en.wikipedia-on-ipfs.org/wiki/ would return mutable and immutable links to content on IPFS:

    Link: <ipns://en.wikipedia-on-ipfs.org/wiki/>; rel=duplicate
    Link: <ipfs://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq/wiki/>; rel=duplicate

To facilitate automated fallback, the list of supported formats (https://github.com/ipfs/go-ipfs/issues/8234) could be included as well. For example, a dag-cbor CID could have:

    Link: <ipfs://bafy?format=block>; rel=describedby; type="application/octet-stream"
    Link: <ipfs://bafy?format=car>; rel=describedby; type="application/octet-stream"
    Link: <ipfs://bafy?format=dag-json>; rel=describedby; type="application/json"
    Link: <ipfs://bafy?format=dag-cbor>; rel=describedby; type="application/cbor"
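
On the gateway side, producing these headers could be as simple as the sketch below. The format-to-media-type table just mirrors the four lines above and is not an agreed list; format_links is a hypothetical helper name.

```python
# Mapping taken from the example headers above; an assumption, not a spec.
FORMAT_TYPES = {
    "block": "application/octet-stream",
    "car": "application/octet-stream",
    "dag-json": "application/json",
    "dag-cbor": "application/cbor",
}


def format_links(cid, formats=FORMAT_TYPES):
    """Build one Link header value per alternative format of the given CID."""
    return [
        f'<ipfs://{cid}?format={name}>; rel=describedby; type="{media_type}"'
        for name, media_type in formats.items()
    ]
```

Each returned string would be sent as its own Link header, so a client can pick the first format it knows how to decode.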

(C) URI hint that the content is available on other Peered gateways

go-ipfs already has a concept of Peering: friendly peers can add each other to the Peering section of their config, which ensures they are always connected to each other and can engage in bitswap without needing the DHT.

I believe we could add an opt-in field for the name of a subdomain gateway backed by a peer and, when present, return a Link header for each "peered gateway".

For example, if dweb.link was peered with cf-ipfs.com (Cloudflare), example.com and example.net, response for https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.dweb.link could include:

    Link: <https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.cf-ipfs.com>; rel=duplicate
    Link: <https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.example.com>; rel=duplicate
    Link: <https://bafybeiemxf5abjwjbikoz4mc3a3dla6ual3jsgpdr4cjr3oz3evfyavhwq.ipfs.example.net>; rel=duplicate    
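
A rough sketch of how a gateway might derive those headers, assuming the opt-in config boils down to a plain list of subdomain gateway hostnames. That list, the own_host parameter, and the function name are all hypothetical; no such go-ipfs config field exists today.

```python
def peered_gateway_links(cid, peered_gateways, own_host, path=""):
    """Build rel=duplicate Link values pointing at peered subdomain gateways."""
    links = []
    for host in peered_gateways:
        if host == own_host:
            continue  # do not advertise the serving gateway as its own mirror
        links.append(f"<https://{cid}.ipfs.{host}{path}>; rel=duplicate")
    return links
```

For the dweb.link example this would be called with own_host="dweb.link" and the three peered hostnames, yielding the three headers shown above.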

lidel avatar Feb 13 '21 14:02 lidel

I love (B). Brave could remember that for visited websites and fetch automatically via IPFS if the source website is down. Should probably be opt-in.

(A) could also enable torrent websites to indicate that the file is also available on IPFS. If they do, some people should be able to (aka: will) make indexes of equivalence between torrent hashes and IPFS CIDs. Torrent clients could then choose trustworthy indexes to rely on, and multiply sources of fetching (even though you can't download partly from BitTorrent and partly from IPFS, alternative sources could be useful for poorly seeded files).

(D) A website like DTube could point at peers that requested the file recently, so they can seed from each other and offload the server. The server would then act more as a coordinator and a last-resort seeder than as the main provider. Similar to (B), but with the extra step of explicitly giving likely providers to accelerate the discovery step of IPFS fetching.

bertrandfalguiere avatar Feb 13 '21 22:02 bertrandfalguiere

See also https://github.com/ipfs/ipfs-companion/issues/1013. In the same spirit, it would be nice to have a way to indicate these headers directly in the HTML: <link rel="duplicate" href="ipfs://…" />, <link rel="canonical" href="ipfs://…" />, and <meta http-equiv="Link" content="<ipfs://bafy…>; rel=duplicate" /> would be possible choices. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Link seems to say that the Link HTTP header is equivalent to the <link…/> HTML tag, so supporting the <link…/> tag in addition to the header seems desirable.
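
As a rough illustration of the <link> variant, a client could pull rel=duplicate hints out of a page with nothing but the Python standard library. The class and function names are made up, and the <meta http-equiv="Link"> form is left out of this sketch.

```python
from html.parser import HTMLParser


class DuplicateLinkFinder(HTMLParser):
    """Collect href values of <link> elements whose rel includes 'duplicate'."""

    def __init__(self):
        super().__init__()
        self.duplicates = []

    def handle_starttag(self, tag, attrs):
        # Also invoked for self-closing <link ... /> via the default handle_startendtag.
        if tag != "link":
            return
        attributes = dict(attrs)
        rels = (attributes.get("rel") or "").lower().split()
        href = attributes.get("href")
        if "duplicate" in rels and href:
            self.duplicates.append(href)


def find_duplicates(html):
    """Return all rel=duplicate hrefs found in the given HTML string."""
    finder = DuplicateLinkFinder()
    finder.feed(html)
    finder.close()
    return finder.duplicates
```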

SuzanneSoy avatar Jul 09 '21 03:07 SuzanneSoy