iroh resolver: add a raw gateway source

The resolver fetches CIDs in https://github.com/n0-computer/iroh/blob/a77a2f62fac52a21167fdf7628ce68a4cca7ea6f/iroh-resolver/src/resolver.rs#L372 using two different sources:

The store is queried for the CID.
If that fails, the p2p module gets the opportunity to fetch it using bitswap.

I'd like to introduce a third source: fetching the raw CID content from one or several gateways. That will be useful when running iroh on mobile devices, where we could limit the p2p mode to mDNS.

I'm not sure about the strategy to use here though. Should we race the gateways with p2p?

Aug 11 '22 22:08 fabricedesre

Hmmm I don't think that's a really great option for a couple reasons:

the gateway is just a thin shell for the p2p node
p2p/storage should be usable as libs by other implementations, which are free to tack on "extra" functionality as they wish; our gateway is pretty much a live example and a good tool for apples-to-apples comparison
from a more technical standpoint, I'm not sure how good that would be for the network. Imagine every implementation did that and you fetched some CID whose providing node went down: you would potentially skip checking for providers, because the other gateways would try to resolve it themselves, fail to fetch it via p2p, request it from other gateways, and the cycle repeats.
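The cycle described in the last point, and one common way to bound it, can be sketched as follows. This is a toy model only: the `Gateway` struct, its single `upstream` link, and the `max_hops` budget are all hypothetical and do not correspond to any iroh or gateway API.

```rust
/// Hypothetical model of a gateway that, on a local miss, asks one upstream gateway.
pub struct Gateway {
    /// Whether this gateway can serve the requested block itself.
    pub has_block: bool,
    /// Index of the upstream gateway in the slice passed to `fetch`.
    pub upstream: usize,
}

/// Follow the upstream links starting at `start`, visiting at most `max_hops` gateways.
/// Returns how many forwards were needed before a hit, or None if the budget ran out.
pub fn fetch(gateways: &[Gateway], start: usize, max_hops: u32) -> Option<u32> {
    let mut current = start;
    for hop in 0..max_hops {
        let gateway = &gateways[current];
        if gateway.has_block {
            return Some(hop);
        }
        // Local miss: forward to the upstream gateway, spending one hop.
        current = gateway.upstream;
    }
    // Budget exhausted. Without it, a ring of gateways that all miss would loop forever.
    None
}
```

With three gateways wired in a ring and none holding the block, `fetch` gives up after `max_hops` visits instead of spinning indefinitely.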

A better alternative would be using p2p for what it is (ie it technically already "speaks" with other gateways/providers) since that's the underlying protocol for them all and it's all part of the same network. At best (and it's on the roadmap) you could spread load across multiple p2p nodes for various reasons (different dht setups, load balancing, regional/latency reasons etc)

Aug 12 '22 11:08 Arqu

This proposal doesn't change the store & p2p nodes, only the resolver which orchestrates them, to add a third way to get a content-addressed blob. That would also be transparent to the gateway itself.

Currently once a CID is fetched from p2p with bitswap we put it in the store, and that would be similar here.
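As a rough illustration of that flow, here is a minimal sketch of a lookup that tries the store first, then each remote source in order, and writes a remote hit back to the store. The `Store` type, the `RemoteSource` alias, and the `fetch_p2p` / `fetch_gateway` stand-ins are assumptions made up for this example; they are not the actual iroh resolver types.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for the iroh store.
#[derive(Default)]
pub struct Store {
    blocks: HashMap<String, Vec<u8>>,
}

impl Store {
    pub fn get(&self, cid: &str) -> Option<Vec<u8>> {
        self.blocks.get(cid).cloned()
    }

    pub fn put(&mut self, cid: &str, data: Vec<u8>) {
        self.blocks.insert(cid.to_string(), data);
    }
}

/// A remote source is any function that may return the bytes for a CID.
pub type RemoteSource = fn(&str) -> Option<Vec<u8>>;

/// Try the store first, then each remote source in order.
/// A block found remotely is written back to the store before being returned.
pub fn resolve(store: &mut Store, remotes: &[RemoteSource], cid: &str) -> Option<Vec<u8>> {
    if let Some(data) = store.get(cid) {
        return Some(data);
    }
    for fetch in remotes {
        if let Some(data) = fetch(cid) {
            store.put(cid, data.clone());
            return Some(data);
        }
    }
    None
}

/// Stand-in for bitswap over p2p: never finds anything in this sketch.
pub fn fetch_p2p(_cid: &str) -> Option<Vec<u8>> {
    None
}

/// Stand-in for a raw gateway fetch: knows exactly one made-up CID.
pub fn fetch_gateway(cid: &str) -> Option<Vec<u8>> {
    if cid == "example-cid" {
        Some(b"hello".to_vec())
    } else {
        None
    }
}
```

Calling `resolve` with `[fetch_p2p, fetch_gateway]` for `"example-cid"` returns the gateway bytes and leaves a copy in the store, so a second lookup never leaves the device.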

If your argument is that this should be added to p2p instead of the resolver, why not. It looks like this would behave similarly.

Aug 12 '22 14:08 fabricedesre

I implemented that idea in https://github.com/capyloon/iroh/commit/70bd96df3e1c581b36e2263a38a6c7696fff075f and that works pretty well.

Aug 15 '22 16:08 fabricedesre

It will work for now.

The thing is, if everyone did this we would be in a recursive loop with all the gateways (barring some depth limiting). It's also not needed and doesn't really achieve much, since they all talk to the same network.

I.e.

Iroh --------------------
                        |
dweb ------------>    IPFS   <--------------- ipfs.io
                        |
cloudflare ------------  -------------------- other gateways

They all talk via libp2p. The issue you currently have is that our p2p code is still very basic and we don't resolve all too well yet. @dignifiedquire is working on that currently and you should see some improvements soon. With those it really doesn't make sense to ping gateways as you're already talking to the entire network on the p2p layer. (except maybe some private p2p networks but that's also solved differently)

Aug 16 '22 07:08 Arqu

I think we'll have to agree that we disagree... but 2 last points:

It should not matter at all where you get the content for a CID from, since its integrity can be verified: block store, p2p, raw gateway, carrier pigeon, etc. They are all just sources, and if you are worried about loops, that's an implementation issue that needs to be addressed.
I want to use Iroh on mobile. In that context, the current DHT is basically unusable because you can't afford to open dozens of persistent connections with the associated traffic. So we need something more in line with local network discovery + reliable fetch from other sources.
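The first point above, that bytes can be checked against the identifier no matter which source delivered them, can be sketched like this. Everything here is illustrative: `ToyCid` and `toy_digest` are made-up names, and the digest is 64-bit FNV-1a, chosen only so the example needs no external crates. It is not cryptographic and is not what IPFS uses; a real CID carries a multihash (commonly SHA-256).

```rust
/// Stand-in digest: 64-bit FNV-1a. NOT cryptographic and NOT what IPFS uses.
pub fn toy_digest(data: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV-1a 64-bit offset basis
    for byte in data {
        hash ^= u64::from(*byte);
        hash = hash.wrapping_mul(0x100000001b3); // FNV-1a 64-bit prime
    }
    hash
}

/// Hypothetical content identifier: just the digest of the bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ToyCid(pub u64);

impl ToyCid {
    /// Derive the identifier for a block of data.
    pub fn of(data: &[u8]) -> Self {
        ToyCid(toy_digest(data))
    }
}

/// Accept bytes from any source only if they hash to the requested identifier.
pub fn verify(cid: ToyCid, data: Vec<u8>) -> Result<Vec<u8>, String> {
    if ToyCid::of(&data) == cid {
        Ok(data)
    } else {
        Err(format!("digest mismatch for {:?}", cid))
    }
}
```

A resolver built this way would pass whatever the store, p2p layer, or gateway returns through `verify` and treat an `Err` as a miss from that source.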

Aug 16 '22 16:08 fabricedesre

Ok... so I might have been viewing all of this through the wrong lens (the non-mobile lens).

The source of the CID is irrelevant and you're right; my point has been that fetching the data from the p2p layer draws on basically the same "pool" of data, and the carrier is not important.
However, not having enough juice for running a DHT and tons of connections on mobile makes a lot of sense. Which takes us back to the start of this.

You essentially want a setup that "races" gateways, or at least uses a round-robin / priority list of gateways to ping for a given CID. In an ideal world, if all the gateways were fast and reliable and resolved everything well, you shouldn't need more than 1, maybe 2 as a fallback.

The confusion came from wanting to put this code in the resolver, which is not where it should live, though it works for your specific use case since you're not hosting your own gateway but rather running single-user nodes. The proper place for this logic is definitely the "gateway" code in your current fork, mainly because that is our application layer. Also, in this case it should no longer be called a gateway but something else; let's call it a mobile node.

Now, depending on what you need, the mobile node might or might not expose a route you can hit (maybe you just want to peer with other mobile nodes but need HTTP to talk because p2p is a no-no on mobile for whatever reason). That route is independent of the code that executes towards the other side of the pipe, where you might have something like:

// route: /mobile-node/ipfs/CID
// This is your outside route handler on the mobile node.
async fn get_handler(cid, ..) -> bytes {
    ...
    // Local cache in your case.
    let mut r = try_fetch_store(cid).await;
    let mut k = None;
    if r.is_none() {
        // The global IPFS network: no affiliation to our gateway, just running iroh code.
        r = try_fetch_p2p(cid).await;
        // Any number of gateways to ping. We might want to do this in parallel and take
        // whatever returns first, because p2p is bad/slow on mobile.
        k = try_fetch_othergw(cid).await;
    }
    if r.is_some() { return r }
    if k.is_some() { return k }
    ...
}
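The "do this in parallel and take whatever returns" idea from the handler above could look roughly like the following. This is a sketch using plain threads and a channel rather than async code, and the `Source` alias plus the `slow_p2p`, `fast_gateway` and `failing_gateway` functions are hypothetical stand-ins, not iroh APIs.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// A source is a function that may return the bytes for a CID.
/// Hypothetical type for this sketch, not an iroh API.
pub type Source = fn(&str) -> Option<Vec<u8>>;

/// Start every source on its own thread and return the first successful answer.
/// Returns None once all sources have reported a miss.
pub fn race(sources: Vec<Source>, cid: &str) -> Option<Vec<u8>> {
    let (tx, rx) = mpsc::channel();
    for fetch in sources {
        let tx = tx.clone();
        let cid = cid.to_string();
        thread::spawn(move || {
            // The receiver may already be gone if another source won; ignore that.
            let _ = tx.send(fetch(&cid));
        });
    }
    // Drop the original sender so the channel closes when all workers finish.
    drop(tx);
    // Skip misses (None) and stop at the first hit.
    rx.iter().flatten().next()
}

/// Stand-in for a slow p2p lookup.
pub fn slow_p2p(_cid: &str) -> Option<Vec<u8>> {
    thread::sleep(Duration::from_millis(300));
    Some(b"from p2p".to_vec())
}

/// Stand-in for a responsive gateway.
pub fn fast_gateway(_cid: &str) -> Option<Vec<u8>> {
    thread::sleep(Duration::from_millis(10));
    Some(b"from gateway".to_vec())
}

/// Stand-in for a gateway that does not have the block.
pub fn failing_gateway(_cid: &str) -> Option<Vec<u8>> {
    None
}
```

Note that the losing threads are not cancelled here; they simply finish in the background and their result is discarded. A real implementation would likely want cancellation and a timeout.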

If you peel down the layers of the gateway, putting things in the resolver does exactly that but IMHO this should be further up the stack as the resolver is closer to the core tech than to app stack. It's also not something we could merge into the main branch as it conflicts with other use cases.

What could be done on this front is maybe to allow for a generic resolver trait which you could implement, and make the resolver buildable from different resolver components, i.e.:

iroh_resolver::Builder::new()
    .with(store_resolver_comp)
    .with(p2p_resolver_comp)
    .with(gateway_resolver_comp)
    .build()

And have the code internally abstract over those so you can just plug more.
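To make the builder idea above a bit more concrete, here is one possible shape for it. This is only a sketch under assumptions: the `ResolverComponent` trait, the `Builder` and `Resolver` structs, and the `Fixed` toy component are invented for illustration and are not the actual `iroh_resolver` API.

```rust
/// Hypothetical trait for one resolver component (store, p2p, gateway, ...).
pub trait ResolverComponent {
    fn name(&self) -> &str;
    fn fetch(&self, cid: &str) -> Option<Vec<u8>>;
}

pub struct Resolver {
    components: Vec<Box<dyn ResolverComponent>>,
}

impl Resolver {
    /// Ask each component in registration order; report which one answered.
    pub fn resolve(&self, cid: &str) -> Option<(String, Vec<u8>)> {
        self.components
            .iter()
            .find_map(|c| c.fetch(cid).map(|data| (c.name().to_string(), data)))
    }
}

#[derive(Default)]
pub struct Builder {
    components: Vec<Box<dyn ResolverComponent>>,
}

impl Builder {
    pub fn new() -> Self {
        Self::default()
    }

    /// Register a component; earlier components are consulted first.
    pub fn with(mut self, component: impl ResolverComponent + 'static) -> Self {
        self.components.push(Box::new(component));
        self
    }

    pub fn build(self) -> Resolver {
        Resolver { components: self.components }
    }
}

/// Toy component that knows a fixed set of CIDs.
pub struct Fixed {
    pub name: &'static str,
    pub known: Vec<(&'static str, &'static [u8])>,
}

impl ResolverComponent for Fixed {
    fn name(&self) -> &str {
        self.name
    }

    fn fetch(&self, cid: &str) -> Option<Vec<u8>> {
        self.known.iter().find(|(k, _)| *k == cid).map(|(_, v)| v.to_vec())
    }
}
```

Registration order doubles as priority in this sketch, so a fork could append a gateway component after the store and p2p ones without touching either of them.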

The current code already leans in that direction with the ContentLoader in the resolver, though there is currently just one implementation of the client, and the builder is not as smooth or as abstracted as outlined above. For a one-off, you could keep a fork of the client with your modifications until the resolver eventually gets "pretty" and easy to use.

Aug 16 '22 18:08 Arqu

I'm fine with not pushing that to iroh-one yet - I can always add that to a fork anyway.

Aug 16 '22 18:08 fabricedesre
