
Add possibility to download IPFS images

Open afbjorklund opened this issue 1 year ago • 6 comments

See https://docs.ipfs.tech/how-to/kubo-basic-cli/ for ipfs

https://docs.ipfs.tech/how-to/address-ipfs-on-web/#native-urls

ipfs://{cidv1}/path/to/resource (see also https://cid.ipfs.tech/)

Closes #2407

afbjorklund avatar Jun 09 '24 11:06 afbjorklund

The lima.yaml would then look something like:

- location: "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"

(the file name is only used for decompression)

Note that the CID digest is not the file digest:

https://cid.ipfs.tech/#QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK (digest: 6278B63498EB92816C50A53202EE3CBEE6FC0F92F97B97CB0AB0A4AE65CCBE38)

https://docs.ipfs.tech/concepts/content-addressing/#cids-are-not-file-hashes

(small detail: zQmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK in multihash format is the same as "sha256:6278b63498eb92816c50a53202ee3cbee6fc0f92f97b97cb0ab0a4ae65ccbe38" in text format)
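To make the relation between the two notations concrete, here is a minimal sketch that builds the hex form of a sha2-256 multihash from the text form. The function name sha256_to_multihash_hex is hypothetical; the only facts it relies on are the multihash prefix bytes 0x12 (sha2-256) and 0x20 (32-byte digest length). The digest used is the one embedded in the CID, not the file digest.

```shell
#!/bin/sh
# Hypothetical helper: turn "sha256:<hex>" into the hex form of a multihash.
# A sha2-256 multihash is: 0x12 (hash function code), 0x20 (digest length), digest.
sha256_to_multihash_hex() {
    case "$1" in
        sha256:*) digest="${1#sha256:}" ;;
        *) echo "unsupported digest: $1" >&2; return 1 ;;
    esac
    # A sha256 digest is 32 bytes, i.e. 64 hex characters.
    if [ "${#digest}" -ne 64 ]; then
        echo "unexpected digest length: ${#digest}" >&2
        return 1
    fi
    printf '1220%s\n' "$digest"
}

sha256_to_multihash_hex "sha256:6278b63498eb92816c50a53202ee3cbee6fc0f92f97b97cb0ab0a4ae65ccbe38"
```

A CIDv0 string is these 34 bytes encoded in base58btc, which is why every CIDv0 starts with Qm.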

afbjorklund avatar Jun 09 '24 11:06 afbjorklund

Note: the ipfs tool outputs CIDv0 by default, unless --cid-version 1 is given.

v0: QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK
v1: bafybeig5sch22ecfox7gq724rz7uivydwvnnpuqdcnjz72iwelgtrakzui

$ ipfs add --cid-version 0 ubuntu-24.04-server-cloudimg-amd64.img 
added QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK ubuntu-24.04-server-cloudimg-amd64.img
 453.00 MiB / 453.00 MiB [=================================================================================================================] 100.00%
$ ipfs add --cid-version 1 ubuntu-24.04-server-cloudimg-amd64.img 
added bafybeig5sch22ecfox7gq724rz7uivydwvnnpuqdcnjz72iwelgtrakzui ubuntu-24.04-server-cloudimg-amd64.img
 453.00 MiB / 453.00 MiB [=================================================================================================================] 100.00%

The CID version doesn't really matter to Lima, but CIDv0 is deprecated.

https://docs.ipfs.tech/concepts/content-addressing/#cid-versions

afbjorklund avatar Jun 09 '24 11:06 afbjorklund

This uses ipfs cat so that Lima can show its own progress bar, and ipfs ls to calculate the total size:

Hash                                                        Size     Name
bafybeibnerap2c5tnmqvyftmyxftftjewvo52acwv2gg6thpipyw5zx7fe 45613056 
bafybeidrbml5blbqj2i67x5gujqbwluis6jy5daiso6ir5pakx4hseztum 45613056 
bafybeie56ruaziz43jj3e5iih5r3fyzudeurqbpts6vwigf7nbreoep5de 45613056 
bafybeifva7p55d4szahrrxqjdjqqqlcrcfxm7n2vnpmwucrdkvcexxngmu 45613056 
bafybeie5wduudyjkbcyjvcif7i6gliipapvl5cw4wia3fqoanxunud7dyi 45613056 
bafybeidezz5b2zng3etcwbn4vnu3tn4ao3wrh5qdwzoyo7ywdp7rbymgzq 45613056 
bafybeiaf7vcteys366fmjiclktnzwuqpygpn4u2ftqrl62juaq5sgjw4be 45613056 
bafybeiasvhz6mljylbqwygb7jvp6d6sohaw325yuan5awlo7shjrlslb7m 45613056 
bafybeieiuxw2i5nkelq5tihd2tgmtxjst2ln6rslt35jxslcz4w4mjsnha 45613056 
bafybeiest5wjnmjwyp7vgawfbityfnkgqpr27bfleysbhdrhldcwxw4yq4 45613056 
bafybeif7btzsv7657kfxu7qte6sh6zgpvpsrj7dnou7usj2nwv4zf77ft4 18874368 
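As a rough illustration of the size calculation, the following sketch sums the Size column of a listing shaped like the one above. The name sum_sizes is hypothetical, and the sample input uses placeholder hashes so that it runs without an ipfs installation; this is not the code from the PR.

```shell
#!/bin/sh
# Hypothetical helper: sum the Size column of `ipfs ls`-style output read from stdin.
# Assumes the layout shown above: hash, size, (possibly empty) name.
# The header line is skipped because its second field is not a number.
sum_sizes() {
    awk '$2 ~ /^[0-9]+$/ { total += $2 } END { printf "%.0f\n", total }'
}

# Sample input mirroring the listing above: ten full chunks plus a smaller last one.
{
    echo "Hash Size Name"
    i=0
    while [ "$i" -lt 10 ]; do
        echo "bafy-placeholder-$i 45613056"
        i=$((i + 1))
    done
    echo "bafy-placeholder-last 18874368"
} | sum_sizes
```

This prints 475004928, which is exactly 453 MiB (453 × 1048576 bytes) and so agrees with the total shown by the progress bar.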

So now a download from an IPFS address looks the same as one from an HTTP address, with the "description":

Downloading the image (ubuntu-24.04-server-cloudimg-amd64.img)
453.00 MiB / 453.00 MiB [---------------------------------] 100.00% 336.20 MiB/s

This replaces the output that you would get from ipfs get, which could also change:

Saving file(s) to /home/anders/.cache/lima/download/by-url-sha256/1855c5dccbd6db83ea6c81c276e0440ad9f156584e2ced824290186f1dae563b/data
 453.00 MiB / 453.00 MiB [==================================================================================] 100.00% 0s

afbjorklund avatar Jun 10 '24 10:06 afbjorklund

At first I was thinking that calculating the digest was unnecessary, since the IPFS storage already includes one.

But we still want to compare the download with the digest we are expecting, to make sure it's the same image...
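A minimal sketch of that comparison, assuming the digest is written as "sha256:<hex>" like in lima.yaml and that sha256sum is available. The name verify_digest is hypothetical and this is not Lima's actual implementation.

```shell
#!/bin/sh
# Hypothetical helper: check a downloaded file against an expected "sha256:<hex>" digest.
# Returns 0 on match, 1 on mismatch, 2 for an unsupported digest format.
verify_digest() {
    file="$1"
    expected="$2"
    case "$expected" in
        sha256:*) want="${expected#sha256:}" ;;
        *) echo "unsupported digest: $expected" >&2; return 2 ;;
    esac
    # Reading from stdin keeps the file name out of the output;
    # sha256sum then prints "<hex>  -", so the first field is the digest.
    got=$(sha256sum < "$file" | awk '{ print $1 }')
    if [ "$got" = "$want" ]; then
        return 0
    fi
    echo "digest mismatch: got sha256:$got, want sha256:$want" >&2
    return 1
}

# Example call (assumes the image has already been downloaded):
# verify_digest ubuntu-24.04-server-cloudimg-amd64.img \
#     "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"
```

The check is independent of the transport, so the same digest can be used whether the file came over HTTP or IPFS.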

afbjorklund avatar Jun 10 '24 10:06 afbjorklund

Looks good, but needs documentation

AkihiroSuda avatar Jun 19 '24 05:06 AkihiroSuda

There is a design flaw with this approach. Currently it would look like:

- location: "https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"
- location: "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"

That means the checksum is duplicated between the transports. Maybe add the CID as a separate field instead:

- location: "https://cloud-images.ubuntu.com/releases/24.04/release-20240423/ubuntu-24.04-server-cloudimg-amd64.img"
  arch: "x86_64"
  digest: "sha256:32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3"
  cid: QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK

Note: the CID can be calculated without adding the image to the store on disk:

$ sha256sum ubuntu-24.04-server-cloudimg-amd64.img 
32a9d30d18803da72f5936cf2b7b9efcb4d0bb63c67933f17e3bdfd1751de3f3  ubuntu-24.04-server-cloudimg-amd64.img
$ ipfs add --only-hash --quieter ubuntu-24.04-server-cloudimg-amd64.img 
QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK

afbjorklund avatar Jun 20 '24 14:06 afbjorklund

Unfortunately, Content-Type and Last-Modified are not provided by IPFS...

Current gateways only use heuristics like file magic and relative freshness.

Note: new implementations are supposed to use CID version 1 (not version 0):

$ ipfs add --cid-version=1 --only-hash --quieter ubuntu-24.04-server-cloudimg-amd64.img 
bafybeig5sch22ecfox7gq724rz7uivydwvnnpuqdcnjz72iwelgtrakzui

afbjorklund avatar Jul 29 '24 08:07 afbjorklund

"Looks good, but needs documentation"

Currently it assumes that IPFS Kubo is set up, i.e. that ipfs add and ipfs get are working:

https://docs.ipfs.tech/how-to/kubo-basic-cli/


The documentation might also mention some experimental features, like using private networks:

https://github.com/ipfs/kubo/blob/v0.30.0/docs/experimental-features.md#private-networks

For testing purposes, you can use ipfs daemon --offline to avoid connecting to the swarm.

See also docs at: https://github.com/containerd/stargz-snapshotter/blob/main/docs/ipfs.md

afbjorklund avatar Sep 28 '24 12:09 afbjorklund

Could also add support for IPFS_GATEWAY, as a fallback?

https://blog.ipfs.tech/ipfs-uri-support-in-curl/

If set, any ipfs: URL would be rewritten into an http/https URL on that gateway instead...

export IPFS_GATEWAY="http://127.0.0.1:8080"
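The rewrite could look roughly like the sketch below, which maps ipfs://<cid>/<path> to the path-style gateway form <gateway>/ipfs/<cid>/<path>. The function name rewrite_ipfs_location is hypothetical, and leaving the location unchanged when IPFS_GATEWAY is unset is an assumption about the desired fallback behaviour.

```shell
#!/bin/sh
# Hypothetical helper: rewrite ipfs://<cid>/<path> to a path-style gateway URL
# when IPFS_GATEWAY is set; otherwise print the location unchanged.
rewrite_ipfs_location() {
    location="$1"
    case "$location" in
        ipfs://*)
            if [ -n "${IPFS_GATEWAY:-}" ]; then
                # Drop one trailing slash from the gateway to avoid "//ipfs/".
                printf '%s/ipfs/%s\n' "${IPFS_GATEWAY%/}" "${location#ipfs://}"
                return 0
            fi
            ;;
    esac
    printf '%s\n' "$location"
}

IPFS_GATEWAY="http://127.0.0.1:8080"
rewrite_ipfs_location "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img"
```

With the gateway above, the call prints http://127.0.0.1:8080/ipfs/QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img, which the existing HTTP downloader could then fetch.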

afbjorklund avatar Oct 09 '24 09:10 afbjorklund

Regarding addressing of ipfs objects, there are some more details here: https://github.com/ipfs/in-web-browsers/blob/master/ADDRESSING.md

"The four stages of the upgrade path for path addressing."

  1. Current: HTTP-to-IPFS gateway e.g. https://ipfs.io/ipfs/$hash
  2. Short term: URL e.g. ipfs://$hash
  3. Mid term: URI e.g. dweb:/ipfs/$hash
  4. Long term: NURI e.g. /ipfs/$hash

So is it simpler to provide only the CID/hash, as additional information? https://docs.ipfs.tech/concepts/content-addressing/#what-is-a-cid
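To illustrate why the bare CID is the common denominator, this sketch reduces each of the four address forms listed above to "<cid>[/<path>]". The function name ipfs_address_to_cid_path is hypothetical, and treating any http(s) URL that contains /ipfs/ as a gateway address is a simplifying assumption.

```shell
#!/bin/sh
# Hypothetical helper: strip the addressing scheme and keep "<cid>[/<path>]".
ipfs_address_to_cid_path() {
    case "$1" in
        ipfs://*)     printf '%s\n' "${1#ipfs://}" ;;
        dweb:/ipfs/*) printf '%s\n' "${1#dweb:/ipfs/}" ;;
        /ipfs/*)      printf '%s\n' "${1#/ipfs/}" ;;
        http://*/ipfs/*|https://*/ipfs/*)
            # Remove everything up to and including the first "/ipfs/".
            printf '%s\n' "${1#*/ipfs/}" ;;
        *) echo "not an IPFS address: $1" >&2; return 1 ;;
    esac
}

hash="bafybeieipdaxd3fzy3j7syzzxdaqxramk65j7ajzcqgmi6b5jyq4jgbwue"
for address in "https://ipfs.io/ipfs/$hash" "ipfs://$hash" "dweb:/ipfs/$hash" "/ipfs/$hash"; do
    ipfs_address_to_cid_path "$address"
done
```

All four calls print the same CID, so storing only the CID leaves the choice of address form to the tool.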

Since it is a multiformat/multihash, it doesn't need an additional prefix*.

* it actually already has a couple of them, but in a string encoded form (one can use ipfs cid format, or https://cid.ipfs.io/, to decipher them)

cid: bafybeieipdaxd3fzy3j7syzzxdaqxramk65j7ajzcqgmi6b5jyq4jgbwue
# base32-cidv1-dag-pb-(sha2-256:32:8878C171ECB9C6D3F96339B8C10BC40C57BA9F8139140CC4783D4E21C49836A1)

afbjorklund avatar Oct 09 '24 16:10 afbjorklund

More user-facing documentation (for the website) can go in a second PR. Maybe just use the links above?

https://docs.ipfs.tech/how-to/kubo-basic-cli/

Also needs documentation on how to update images; then again, we don't have any docs for sha256 either...

- location: https://github.com/containerd/nerdctl/releases/download/v1.7.6/nerdctl-full-1.7.6-linux-amd64.tar.gz
  arch: x86_64
  digest: sha256:2c841e097fcfb5a1760bd354b3778cb695b44cd01f9f271c17507dc4a0b25606
  cid: bafybeieipdaxd3fzy3j7syzzxdaqxramk65j7ajzcqgmi6b5jyq4jgbwue

For example, when updating to 1.7.7, how do you update the other fields?

$ wget https://github.com/containerd/nerdctl/releases/download/v1.7.7/nerdctl-full-1.7.7-linux-amd64.tar.gz
...
HTTP request sent, awaiting response... 200 OK
Length: 259844835 (248M) [application/octet-stream]
Saving to: ‘nerdctl-full-1.7.7-linux-amd64.tar.gz’
$ sha256sum nerdctl-full-1.7.7-linux-amd64.tar.gz
a731eac93e8e9dda1a0d76dc1606438deb0668ea7d6bd5c5af436353ed9f65c5  nerdctl-full-1.7.7-linux-amd64.tar.gz
$ ipfs add --only-hash --cid-version=1 --progress=false nerdctl-full-1.7.7-linux-amd64.tar.gz 
added bafybeiexmdvas4d3dy3npvecj3udihifaqndhelpiyjb67zbsm3g5eqlba nerdctl-full-1.7.7-linux-amd64.tar.gz

afbjorklund avatar Oct 09 '24 18:10 afbjorklund

"cidsum" wrapper:

#!/bin/sh
ipfs add --only-hash --cid-version=1 --progress=false "$@"

Something like:

$ sha256sum ubuntu-24.04-server-cloudimg-*.img
0e25ca6ee9f08ec5d4f9910054b66ae7163c6152e81a3e67689d89bd6e4dfa69  ubuntu-24.04-server-cloudimg-amd64.img
5ecac6447be66a164626744a87a27fd4e6c6606dc683e0a233870af63df4276a  ubuntu-24.04-server-cloudimg-arm64.img
$ cidsum ubuntu-24.04-server-cloudimg-*.img
added bafybeievi2i673t7kgzx6vsxc6lod3bvagzc364e6kes43p6catfowgone ubuntu-24.04-server-cloudimg-amd64.img
added bafybeich5idqhqtdh3f4nxq6kv6l2fd2nsvynha3vdqek4a7mu7nawp6sm ubuntu-24.04-server-cloudimg-arm64.img
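Unlike sha256sum, the output above starts with "added". If the same two-column layout is wanted, a small filter could reshape it. This is only a sketch: the name cidsum_format is hypothetical, and it assumes the "added <cid> <name>" line format shown above.

```shell
#!/bin/sh
# Hypothetical filter: turn "added <cid> <name>" lines into "<cid>  <name>",
# the two-space layout that sha256sum prints. Other lines pass through unchanged.
cidsum_format() {
    sed 's/^added \([^ ][^ ]*\) \(.*\)$/\1  \2/'
}

# Sample line taken from the output above, so no ipfs installation is needed here.
echo "added bafybeievi2i673t7kgzx6vsxc6lod3bvagzc364e6kes43p6catfowgone ubuntu-24.04-server-cloudimg-amd64.img" | cidsum_format
```

In the wrapper, the ipfs add command would simply be piped through cidsum_format.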

afbjorklund avatar Oct 22 '24 13:10 afbjorklund

We might want to refactor the downloader separately; then it would be easier to add handlers...

i.e. ipfs for ipfs: and oras for oras:, and whatever other alternatives there are to plain http: and https:
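The handler idea can be sketched as a dispatch on the scheme of the location. Everything here is hypothetical: the function name, the handler names it prints, and the assumption that a location without a scheme is a local file. It only shows the shape such a refactoring might take, not Lima's downloader.

```shell
#!/bin/sh
# Hypothetical dispatcher: pick a download handler name from the location's scheme.
# The handler names are placeholders, not existing Lima functions.
handler_for_location() {
    case "$1" in
        http:*|https:*) echo "http" ;;
        ipfs:*)         echo "ipfs" ;;
        oras:*)         echo "oras" ;;
        *://*)          echo "unsupported scheme: ${1%%://*}" >&2; return 1 ;;
        *)              echo "file" ;;  # assumption: no scheme means a local path
    esac
}

handler_for_location "ipfs://QmUy3RRqbpsxXYQ7yp4h2koFHdGdVTCfRiqCBLkw1JobUK/ubuntu-24.04-server-cloudimg-amd64.img"
```

Adding a new transport would then mean adding one case branch and one handler, without touching the existing ones.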

afbjorklund avatar Oct 22 '24 14:10 afbjorklund