
Gateway works with any URL

Open jchris opened this issue 3 years ago • 10 comments

Since we are archiving NFT assets from centralized URLs as well, we should offer an easy path for gallery and wallet authors to leverage our CDN, even for those URLs. In addition to a standard API to look up assets by NFT, we could offer a sweet no-code alternative.

The idea is simple:

GET  https://anyurl.nftstorage.link/https://example.com/1.png 

should do one of two things. If we have the content already saved to NFT.storage, it should 301 redirect to the CID gateway url. If we haven't archived the content, then we 302 redirect to the original URL.

The important part is enabling developers to write the following code without second thought:

let fastUrl = "https://anyurl.nftstorage.link/" + originalUrl;

It's OK if the fast case only happens for CID URLs at first. The important part is telling developers they can add that code without breaking their app.

Example of trivial case when originalUrl is ipfs://bafy.../foo/1.png (something we hope to find commonly on-chain):

GET https://nftstorage.link/ipfs://bafy.../foo/1.png
301 https://bafy…ipfs.nftstorage.link/foo/1.png

Without any further heavy lifting, we can make this work today when originalUrl is not a CID URL (e.g. https://example.com/1.png) by redirecting to the source.

GET https://nftstorage.link/https://example.com/1.png 
302 https://example.com/1.png 

Over time as we add features, we can redirect more and more classes of URLs to their cached CIDs.
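
The two cases above can be sketched as a single function. This is only an illustration, not the gateway's actual code: `resolveRedirect` is a hypothetical name, and the sketch assumes the CID is already in a form that is valid as a subdomain label.

```javascript
// Hypothetical helper: pick the redirect for the URL that follows
// "https://nftstorage.link/" in the request.
function resolveRedirect(originalUrl) {
  // ipfs://<cid>/<path> is the trivial case: redirect permanently to the
  // subdomain gateway URL. Assumes the CID is usable as a DNS label.
  const match = /^ipfs:\/\/([^/?#]+)(\/[^?#]*)?/.exec(originalUrl);
  if (match) {
    const [, cid, path = "/"] = match;
    return { status: 301, location: `https://${cid}.ipfs.nftstorage.link${path}` };
  }
  // Anything else is not archived yet (in this sketch), so send the
  // client to the source with a temporary redirect.
  return { status: 302, location: originalUrl };
}

console.log(resolveRedirect("ipfs://bafyexample/foo/1.png"));
console.log(resolveRedirect("https://example.com/1.png"));
```

Using 302 for the fallback keeps the door open: once a URL is archived, the same request can start returning a 301 to its CID without clients having cached the old answer permanently.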

jchris avatar Feb 21 '22 21:02 jchris

A benefit of structuring it this way is that page authors can naively construct URLs, and we can do the right thing. So their code just looks like

let fastUrl = "https://nftstorage.link/" + originalURL

And it will work just fine even if original URL is an ipfs:// or gateway link.

let fastUrl = "https://nftstorage.link/ipfs://CID"

jchris avatar Feb 21 '22 21:02 jchris

Tagging @vasco-santos @Gozala @JeffLowe for feedback. Thanks!

jchris avatar Feb 21 '22 22:02 jchris

Love it! I've been suggesting doing the same thing for ipfs URLs across all gateways as well!

P.S. I would even drop nft-data-by-url part

Gozala avatar Feb 22 '22 04:02 Gozala

Yeah, I think supporting this would be super nice for transition!

put it the 302s into a queue for archive

That's an interesting angle. It is important to consider abuse of course, but this would be super sweet to onboard data (especially data where it is cheap to just compute CIDs).

A side note: we are not using the gateway.nft.storage domain, so I edited all messages in this issue to use the domain we will be using, nftstorage.link, to avoid confusion.

vasco-santos avatar Feb 22 '22 09:02 vasco-santos

The idea is simple:

GET  https://nftstorage.link/nft-data-by-url/https://example.com/1.png 

should do one of two things. If we have the content already saved to NFT.storage, it should 301 redirect to the CID gateway url. If we haven't archived the content, then we 302 redirect to the original URL.

(Future enhancement: put it the 302s into a queue for archive, but we run the risk of people inserting a lot of non NFT content.)

How does the data at https://example.com/1.png get into NFT.Storage? It seems that we'll always issue a 302 redirect until we do the "Future enhancement".

Also for the "future enhancement" we'll have to maintain a mapping of URL->CID for all URLs we ingest...right? So when we get a request we can redirect to the gateway URL?

I have concerns around setting up a service that receives a user-provided URL and then sends a request to that URL. It seems like an easy way to get NFT.Storage to perform a DDoS attack. We'd have to be sure to mitigate against that.
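
One possible mitigation, sketched under assumptions: cap how many outbound fetches the service makes per target host in a time window. `createHostLimiter` and its parameters are hypothetical names for illustration, and a real deployment would need shared state across workers plus further checks (allowed protocols, private address ranges).

```javascript
// Hypothetical per-host limiter: allow at most `limit` outbound fetches
// to the same hostname within each `windowMs` window.
function createHostLimiter(limit, windowMs, now = Date.now) {
  const windows = new Map(); // hostname -> { start, count }
  return function allow(rawUrl) {
    const host = new URL(rawUrl).hostname;
    const t = now();
    const w = windows.get(host);
    // Start a fresh window if none exists or the old one has expired.
    if (!w || t - w.start >= windowMs) {
      windows.set(host, { start: t, count: 1 });
      return true;
    }
    if (w.count < limit) {
      w.count += 1;
      return true;
    }
    return false; // over budget for this host: skip the fetch
  };
}

const allow = createHostLimiter(2, 60000);
console.log(allow("https://example.com/1.png"));
```

When `allow` returns false, the service can still answer with the 302 redirect; it just does not fetch the content itself.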

Suggestion: I think a client method that takes a URL, requests the data, stores it in NFT.Storage and gives you back the CID would be useful to developers.
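
A rough sketch of what such a method could look like. `storeFromUrl` is a hypothetical name, and `fetchFn` and `storeBlob` are injected so the sketch does not depend on a particular client API; `storeBlob` is assumed to resolve to a CID string.

```javascript
// Hypothetical helper: fetch the data at `url`, store it, return the CID.
async function storeFromUrl(url, { fetchFn, storeBlob }) {
  const res = await fetchFn(url);
  if (!res.ok) {
    throw new Error(`Fetch failed for ${url}: ${res.status}`);
  }
  const blob = await res.blob();
  // storeBlob is whatever uploads a blob and resolves to its CID.
  return storeBlob(blob);
}
```

In an application, `fetchFn` would typically be the platform `fetch`, and `storeBlob` a thin wrapper around the storage client's upload call.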


Related aside: a long time ago we talked about the centralized URLs niftysave encountered and about having a way to resolve them with MFS after they've been ingested. Basically, each one would be added to MFS at a path that's derivable from the centralized URL, so for the example URL you'd maybe resolve the CID like:

$ ipfs files stat /https/example/com/1.png
QmZLyRxvpBgeGif8DHsvEuk2tRYsJo6JSK1QWwTGFtwbcL
Size: 430324
CumulativeSize: 430456
ChildBlocks: 2
Type: file

So then you'd be able to stat the MFS root to get the CID for all NFTs and resolve that on a gateway e.g.:

$ ipfs files stat / --hash
QmXPwTnSvMK9vvnZNXif3GwmfqeFgMF2bvK8zdhqeTdkxL

$ open https://ipfs.io/ipfs/QmXPwTnSvMK9vvnZNXif3GwmfqeFgMF2bvK8zdhqeTdkxL/https/example/com/1.png
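
The path derivation could be sketched like this. `urlToMfsPath` is a hypothetical name, and the scheme/host/path layout is inferred only from the single example above; ports, query strings and fragments are ignored here and would need a decision.

```javascript
// Hypothetical helper: derive an MFS path from a centralized URL, e.g.
// https://example.com/1.png -> /https/example/com/1.png
function urlToMfsPath(rawUrl) {
  const url = new URL(rawUrl);
  const scheme = url.protocol.replace(/:$/, "");
  const hostParts = url.hostname.split(".");
  // Drop empty segments produced by leading or repeated slashes.
  const pathParts = url.pathname.split("/").filter(Boolean);
  return "/" + [scheme, ...hostParts, ...pathParts].join("/");
}

console.log(urlToMfsPath("https://example.com/1.png"));
```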

alanshaw avatar Feb 22 '22 13:02 alanshaw

We might need a super fast index for URL -> fetched CIDs, like keeping them in a memcached or something.
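
As a sketch of the lookup, with a plain in-memory Map standing in for memcached or any other store (`recordArchived` and `lookupRedirect` are hypothetical names):

```javascript
// Hypothetical index from normalized URL to the CID we fetched for it.
const urlToCid = new Map();

// Normalize through the URL parser so scheme and host casing do not
// produce duplicate keys.
function recordArchived(url, cid) {
  urlToCid.set(new URL(url).href, cid);
}

function lookupRedirect(url) {
  const cid = urlToCid.get(new URL(url).href);
  return cid
    ? { status: 301, location: `https://${cid}.ipfs.nftstorage.link/` }
    : { status: 302, location: url };
}

recordArchived("https://example.com/1.png", "bafyexample");
console.log(lookupRedirect("https://example.com/1.png"));
```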

jchris avatar Feb 23 '22 17:02 jchris

This is a really cool idea @jchris! In case you want to poke at prior art, Internet Archive supports something similar:

  • https://web.archive.org/http://amturing.acm.org/p558-lamport.pdf

lidel avatar Feb 24 '22 19:02 lidel

Related to https://github.com/nftstorage/nft.storage/issues/1045. Only really useful once we have niftysave data stored and a mapping data structure from centralized URLs -> CIDs accessible.

JeffLowe avatar Mar 03 '22 15:03 JeffLowe

This would be a good answer here when it is ready. [image]

jchris avatar Mar 06 '22 01:03 jchris

We can do this today by just redirecting through any URL we don't have an easy answer for. E.g. non-CID URLs can get a 302.

jchris avatar Mar 29 '22 18:03 jchris