Trampoline Routing (2021 edition) (Feature 56/57)

Open t-bast opened this issue 4 years ago • 18 comments

This proposal allows nodes running on constrained devices to sync only a small portion of the network and leverage trampoline nodes to calculate the missing parts of the payment route while providing the same privacy as fully source-routed payments.

The main idea is to use layered onions: a normal onion contains a smaller onion for the last hop of the route, and that smaller onion contains routing information about the next trampoline hop.
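As a rough illustration of the layering described above, here is a toy, unencrypted Python model. It only shows where the inner (trampoline) onion sits; it does no cryptography, and every name in it (`TrampolineOnion`, `OuterPayload`, `build_outer_route`, `next_trampoline`) is hypothetical rather than taken from the spec or from any implementation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrampolineOnion:
    # Remaining trampoline node ids, in order (toy stand-in for the inner onion).
    remaining_hops: List[str]


@dataclass
class OuterPayload:
    node_id: str
    # Only the last hop of the outer route carries the inner onion.
    inner: Optional[TrampolineOnion] = None


def build_outer_route(route: List[str], trampolines_after: List[str]) -> List[OuterPayload]:
    """Build per-hop payloads for an outer route that ends at a trampoline node."""
    if not route:
        raise ValueError("outer route must contain at least one hop")
    payloads = [OuterPayload(node_id=n) for n in route]
    payloads[-1].inner = TrampolineOnion(remaining_hops=list(trampolines_after))
    return payloads


def next_trampoline(payloads: List[OuterPayload]) -> Optional[str]:
    """Return the next trampoline hop that the last outer hop would learn about."""
    inner = payloads[-1].inner
    if inner is None or not inner.remaining_hops:
        return None
    return inner.remaining_hops[0]
```

In this model, intermediate outer hops see no inner onion at all, and the trampoline at the end of the outer route only reads the next entry of the inner one.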

This PR provides a high-level view of trampoline routing, where concepts and designs are presented in a more user-friendly format than formal spec work. This document lets reviewers see the big picture and how all the pieces work together. It also contains pretty detailed examples that should give reviewers some intuition about the subtle low-level details.

Then reviewers can move on to https://github.com/lightningnetwork/lightning-rfc/pull/836 which contains the usual spec format for the onion construction: this is where we'll work on the nitty-gritty details.

This PR supersedes #654 based on what we learnt after 1 year running trampoline in production in Phoenix and many discussions with @ecdsa while Electrum worked on their own trampoline implementation. The important changes are:

  • the trampoline onion is now variable-size: it's much more flexible and has no privacy downside since it's not observable at the network layer (which is the reason why the outer onion is constant size)
  • trampoline doesn't need any new gossip mechanism and instead relies on the recipient doing a small amount of work to include trampoline hints in invoices that specify the fees and cltv_expiry_delta that must be used for the last trampoline hop
  • trampoline nodes may send back an error asking the payer to retry with a higher fee or cltv_expiry_delta: since senders need at most two trampoline hops to protect their privacy (and in some cases, only one trampoline hop), this retry-on-failure approach doesn't add too much latency in practice

t-bast avatar Dec 28 '20 16:12 t-bast

Hi @t-bast can you please link to proposals/trampoline.md? I am not able to locate it.

saubyk avatar Dec 28 '20 16:12 saubyk

Hi @t-bast can you please link to proposals/trampoline.md?

https://github.com/lightningnetwork/lightning-rfc/blob/trampoline-routing-no-gossip/proposals/trampoline.md is the best way to read it (with github's markdown viewer).

t-bast avatar Dec 28 '20 16:12 t-bast

Regarding trampoline invoices: Do you suggest creating a new tagged field for trampoline_hints, or are we going to reuse the existing r tags and consider them trampoline hints if the invoice has trampoline_routing_opt in its features?

ecdsa avatar Dec 31 '20 13:12 ecdsa

@ecdsa the proposed spec has a new tagged field t: https://github.com/lightningnetwork/lightning-rfc/blob/trampoline-routing-no-gossip/trampoline.md#invoice-trampoline-hints

fiatjaf avatar Jan 04 '21 04:01 fiatjaf

Yes, I think it's better to introduce a new one, tailored for trampoline. But that is of course open to discussion if there's a better way.

t-bast avatar Jan 04 '21 08:01 t-bast

Very nice proposal, can't wait to see this in action. The onion construction is mostly taken care of if you implement @rustyrussell's offers proposal (which also uses the same construction and parametrizes the size of the onion).

One rationale that I couldn't find, but that I think ACINQ relies on, is the efficiency gain when the sender is on a slow or high-latency connection: each payment attempt requires a round trip to the sender (in Eclair's case, a mobile phone on a potentially flaky connection). Outsourcing the path-finding and retry logic to a better-connected node can result in a far lower time-to-completion for the payment, and the sender can save some bandwidth.

I just have the following points that I am still a bit unclear on:

  • There seem to be no limits in place for the inner onion, other than that it's smaller. Specifically we have an issue with the inner x outer path length. If we generate an inner onion consisting of nodes that are diametrically opposed in the network, we can end up with a route that is many times larger than the implicit 20 hop limit we have at the moment. The diameter of the network is currently about 7, so an inner onion with 20 hops (using the outer onion limit as a proxy here), could end up with 140 effective routing hops, which also acts as a force-multiplier for an attacker wanting to hold up as much liquidity in the network as possible.

  • Are trampoline hints intended to replace or augment the route hints? In the former case we end up in a situation where the destination must be known to at least one trampoline, in the latter we may end up duplicating a lot of information in the invoice. Further the duplication seems to be solely due to the fact that route hints don't have a feature bitset.

    • Should we just bump route hints to v2 and make them a TLV so we can add a feature bitset? It'd potentially save us some trouble in the future.
    • Do we want to have trampoline hints at all? Isn't it in the sender's interest to not have them, and instead add the existing route hints to the last trampoline's payload? That way we don't make the invoices larger for everyone, while still maintaining their functionality.

cdecker avatar Jan 04 '21 19:01 cdecker

There seem to be no limits in place for the inner onion, other than that it's smaller. (...) which also acts as a force-multiplier for an attacker wanting to hold up as much liquidity in the network as possible.

This is true. However, attackers have no control over which channels will be used to relay with trampoline, so this would be a blind attack against the whole network rather than one targeting specific nodes (which is still concerning). There has been some progress recently on proposals to fight spam, so we can hope that this issue will eventually be fixed. But in the meantime, implementations can restrict the maximum size of the trampoline onion (we can add a recommended value in the spec) just like we restricted channel capacity at first.
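For intuition, the worst-case amplification raised in the question, and the effect of capping the trampoline onion, can be written as simple arithmetic. This is only a back-of-the-envelope sketch: the function names are made up for illustration, and the numbers (20 inner hops, network diameter of about 7) are the ones quoted in the question, not measured values.

```python
def worst_case_effective_hops(inner_hops: int, network_diameter: int) -> int:
    """Upper bound on routing hops if every trampoline-to-trampoline leg spans the whole network."""
    if inner_hops < 0 or network_diameter < 0:
        raise ValueError("inputs must be non-negative")
    return inner_hops * network_diameter


def max_inner_hops_for_budget(hop_budget: int, network_diameter: int) -> int:
    """Largest inner-onion length that keeps the worst case within a given hop budget."""
    if network_diameter <= 0:
        raise ValueError("network_diameter must be positive")
    return hop_budget // network_diameter


# 20 inner hops across a network of diameter 7 gives the figure from the question.
print(worst_case_effective_hops(20, 7))  # 140
```

A cap chosen this way depends on an estimate of the network diameter, which changes over time, so any recommended value would need to be revisited.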

Are trampoline hints intended to replace or augment the route hints?

I think we want to introduce a different, more flexible invoice routing hint. It's probably a good opportunity, as you mention, to introduce the tlv format in these routing hints v2. I know @rustyrussell mentioned it multiple times in the past, do you have some requirements / early designs for these new routing hints?

Let's explore the advantages and drawbacks of using the current routing hints (let me know if I'm missing important points there):

  • Advantages:
    • Invoices don't get bigger than they are today
  • Drawbacks:
    • It takes a lot of space in the trampoline onion
    • It encourages single-trampoline payments, which is bad for privacy (trampoline node may know both sender and recipient)
    • It offers a less flexible design space for recipient anonymity schemes

It's also important to note that to receive trampoline payments, the recipient still needs to upgrade his software to support trampoline onion decryption, so it's a good opportunity to implement a new routing hint format at the same time.

The advantages and drawbacks of using new routing hints are:

  • Advantages:
    • Opportunity to add missing fields (features) and add future fields while staying backwards-compatible
    • Opportunity to remove unused fields when unnecessary, resulting in smaller invoices
    • No additional data needs to be transmitted in the trampoline onion
    • Trampoline routes must use at least two trampolines, which is much better for privacy
    • No channel information is leaked in invoices, which is better for privacy
    • More flexible design space for recipient anonymity schemes
  • Drawbacks:
    • Invoices get bigger if we want them to potentially be paid by "legacy" senders (but note that in some cases we could instead issue two invoices, a legacy one and a trampoline one, and that will eventually go away when everyone supports the new format)

To exemplify the argument that the design space for recipient privacy would be bigger, I've been toying with the idea of short-lived tor-like circuits between mobile wallet recipients and trampoline nodes. Imagine we have a graph like this, where Bob wants to be paid without revealing his identity to trampoline nodes and without revealing private channel information:

                            public              private
                      T1 ------------> Carol -------------+
                                                          |
                                                          v
                                                         Bob
                                                          ^
                                                          |
      public                public             private    |
T2 ------------> Mallory ------------> Dave --------------+

Obviously if Bob uses a mobile wallet with private channels, Carol and Dave know that he is the final recipient when forwarding payments to him (frequent IP address changes, offline most of the time, etc). But Bob doesn't want his invoices to reveal that he's connected to Carol and Dave (e.g. to hide what lightning service provider he's using). Bob could send anonymous onion messages to T1 and T2 to create short-lived circuits (for the duration of the invoice validity) to get them to route payments in the right direction, without revealing more than the next hop. He then only needs to specify T1 and T2 in his routing hints, information about Carol, Dave or concrete channels is not needed.

Of course, this idea is very hand-wavy for now and has a lot of holes (and maybe route blinding works better and achieves the same results), so please don't poke at it too much yet, you will find issues, but that's not the point. I'm mentioning it because it shows how more flexible routing hints can provide a wider design space for recipients, which is important IMO.

t-bast avatar Jan 08 '21 11:01 t-bast

The current proposal adds a single new failure message, trampoline_fee_expiry_insufficient. It does not specify any new failure message if a trampoline node fails to find a route to the next trampoline in a multi trampoline route.

The current Eclair implementation seems to return amount_below_minimum in all cases, regardless of the failure scenario. For example, if I create a route with two trampolines, and the second trampoline does not exist, I receive that failure message. It would be good to specify what message a trampoline node must return in that case (does it have to be amount_below_minimum?), and it might be useful to distinguish between "next trampoline is temporarily unreachable" and "next trampoline is unknown".

ecdsa avatar Jan 08 '21 14:01 ecdsa

trampoline doesn't need any new gossip mechanism and instead relies on the recipient doing a small amount of work to include trampoline hints in invoices

The previous proposal had a node_update gossip message. That message is not equivalently replaced by the trampoline_hints in the invoice. If the sender decides to use an intermediate trampoline not in the invoice, they need to find out the fee/cltv parameters for that intermediate trampoline.

It is true that we may rely on trial-and-error and the trampoline_fee_expiry_insufficient message for that. However, I think that this induces an important change in the semantics of the fee and cltv fields in data:

  • in the previous proposal, the node_update message was used to publish fee and cltv values. By definition, these values used to be a function of time, but not of the payment sent to the trampoline.
  • in the new proposal, the fee and cltv parameters might depend on the payment. The proposal says: "The fee amount or cltv value was below that required by the trampoline node to forward to the next trampoline node." This seems to imply that the fee and cltv fields returned in data will be what the trampoline node would require for the current payment, but this does not guarantee success for a payment to another route.

I think the proposal should clarify whether the values returned in the error message are relative to the current payment, or independent of the payment. (note that I have a slight preference for fixed values. If these values depend on the payment, clients might have to do more trial-and-error in order to discover them, which might result in more htlcs flying around)

ecdsa avatar Jan 08 '21 15:01 ecdsa

The current Eclair implementation seems to return amount_below_minimum in all cases, regardless of the failure scenario

If it does that, it's a bug. It should only do that for small payments to phoenix wallets that don't have enough incoming liquidity (and thus need a new channel opened on-the-fly). For other cases it should return unknown_next_peer. It's definitely possible that this case isn't currently handled correctly since only single-trampoline has been implemented.

I think the errors returned should be the following:

  • I need more fees/cltv to relay: trampoline_fee_expiry_insufficient
  • There's no route in the graph to the next trampoline node: unknown_next_peer
  • There's a route in the graph to the next trampoline node, but I don't have enough balance: temporary_node_failure
  • There's a route in the graph to the next trampoline node, but it fails downstream: depending on the downstream failure, either directly relay that failure or replace it with a temporary_node_failure

Of course, trampoline nodes may also use other errors such as amount_below_minimum (or other Bolt 4 errors) in specific cases where that makes sense. Does that sound good?
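The mapping listed in this comment could be sketched as a small decision function. This is an illustration only, not Eclair's code: the function name, its boolean inputs, and the order in which the conditions are checked are all assumptions, and only the failure names come from the list above.

```python
from typing import Optional


def select_failure(
    route_exists: bool,
    fees_sufficient: bool,
    has_balance: bool,
    downstream_failure: Optional[str] = None,
    relay_downstream: bool = True,
) -> Optional[str]:
    """Pick the failure a trampoline node returns, or None if the payment was relayed."""
    if not route_exists:
        # No route in the graph to the next trampoline node.
        return "unknown_next_peer"
    if not fees_sufficient:
        # The sender must retry with a higher fee or cltv_expiry_delta.
        return "trampoline_fee_expiry_insufficient"
    if not has_balance:
        # A route exists, but this node lacks the balance to use it.
        return "temporary_node_failure"
    if downstream_failure is not None:
        # Either relay the downstream failure as-is or hide it behind a generic one.
        return downstream_failure if relay_downstream else "temporary_node_failure"
    return None
```

The check order matters when several conditions hold at once; a real node might prefer a different priority.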

The previous proposal had a node_update gossip message. That message is not equivalently replaced by the trampoline_hints in the invoice. If the sender decides to use an intermediate trampoline not in the invoice, they need to find out the fee/cltv parameters for that intermediate trampoline.

I decided to remove that mechanism because it wasn't working well: it's impossible to correctly estimate fees for all possible payments. It will force clients to overpay too often, or will still need a trial-and-error when you try to minimize the overpayment.

I'm now more in favour of the trial-and-error approach, where trampoline_fee_expiry_insufficient will return the fees/cltv that should work for this specific payment. This guarantees that the retry should work if the trampoline correctly ran his path-finding algorithm, and used a small error buffer just in case (which minimizes sender-side retries). Some trampoline nodes may choose to return a value that would work for other payments as well (and maybe overpay this one), this is an implementation choice (and is a trade-off routing nodes should consciously choose to make or not). Clients may cache this value and use it for other payments as well, which may work (or may not, in which case the trial-and-error will let them correct this).
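A minimal sender-side sketch of this trial-and-error loop might look like the following. It assumes the failure carries the fee and cltv_expiry_delta the trampoline wants for this payment; the types (`Attempt`, `FeeExpiryInsufficient`), the `send` callback, and the `max_fee_msat` / `max_attempts` limits are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Attempt:
    fee_msat: int
    cltv_expiry_delta: int


@dataclass
class FeeExpiryInsufficient:
    # Values the trampoline says it needs for this specific payment (assumed payload).
    fee_msat: int
    cltv_expiry_delta: int


# The callback returns None on success, or the failure sent back by the trampoline.
SendFn = Callable[[Attempt], Optional[FeeExpiryInsufficient]]


def pay_with_retries(
    send: SendFn, initial: Attempt, max_fee_msat: int, max_attempts: int = 3
) -> Optional[Attempt]:
    """Retry with the values from the failure until success or a sender-side limit is hit."""
    attempt = initial
    for _ in range(max_attempts):
        failure = send(attempt)
        if failure is None:
            return attempt  # parameters that worked; a client may cache them
        if failure.fee_msat > max_fee_msat:
            return None  # the trampoline asks for more than the sender will pay
        attempt = Attempt(
            fee_msat=max(attempt.fee_msat, failure.fee_msat),
            cltv_expiry_delta=max(attempt.cltv_expiry_delta, failure.cltv_expiry_delta),
        )
    return None
```

The fee ceiling is there so that a trampoline cannot push the sender to arbitrary fees through repeated failures.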

I think this approach is the most flexible one: we should leave some room for implementations to make different choices, to ensure we have a diverse network.

t-bast avatar Jan 08 '21 16:01 t-bast

Yes, I think it's better to introduce a new one, tailored for trampoline. But that is of course open to discussion if there's a better way.

In practice, wallets will be compelled to include both trampoline_hints and routing_hints in their invoices, because they do not know if the sender understands trampoline. For an invoice with a reasonably sized description and a single routing hint, this results in a 10% size increase.

I guess the data in trampoline_hints will most of the time be redundant with what is in routing_hints (the payee is going to infer the trampoline fee/cltv from the routes it knows). Are there use cases where that would not be redundant?

ecdsa avatar Jan 22 '21 11:01 ecdsa

In practice, wallets will be compelled to include both trampoline_hints and routing_hints in their invoices, because they do not know if the sender understands trampoline.

Yes, that's true. And the transition period during which both will be required will likely last months (hopefully not years though as non-trampoline wallets will have an incentive to migrate to the new routing hints, even if they don't want to support trampoline).

I think these new routing hints will solve a few issues that the current routing hints have, so people would upgrade to use them regardless of trampoline, and once that's done the size overhead will be somewhat negligible.

Are there use cases where that would not be redundant?

They will be different for cases where current routing_hints include more than one hop, but in practice I don't think anyone is using this (yet).

t-bast avatar Jan 22 '21 11:01 t-bast

It seems to me like this proposal should be split into two: one that introduces just the trampoline construction and forwarding logic, and a second one that allows recipients to specify trampolines for incoming payments. This is because in my eyes the latter is a new requirement, that deviates from the original trampoline proposal, and I'd like to unbundle the two.

It's also important to note that to receive trampoline payments, the recipient still needs to upgrade his software to support trampoline onion decryption, so it's a good opportunity to implement a new routing hint format at the same time.

This looks like a new requirement to me, since the last trampoline could be signaled to pay the final destination in non-trampoline mode, thus not requiring the recipient to even understand the trampoline protocol. It's the difference between "some node must support trampolines" and "this specific recipient needs to support trampolines". This is particularly important given that trampolines are first and foremost a tool for the sender, and requiring the recipient to play nice for the sender is an extra requirement.

I think we want to introduce a different, more flexible invoice routing hint. It's probably a good opportunity, as you mention, to introduce the tlv format in these routing hints v2. I know @rustyrussell mentioned it multiple times in the past, do you have some requirements / early designs for these new routing hints?

We can totally do that as well, however I think upgrading the route-hints to be TLVs with all the mentioned improvements is orthogonal to trampolines, hence my objection.

Bob could send anonymous onion messages to T1 and T2 to create short-lived circuits (for the duration of the invoice validity) to get them to route payments in the right direction, without revealing more than the next hop. He then only needs to specify T1 and T2 in his routing hints, information about Carol, Dave or concrete channels is not needed.

Is this not orthogonal to the original trampoline proposal? I don't see why we need to mix it in here, especially since with route blinding there is already a competing proposal that achieves pretty much the same.

If it does that, it's a bug. It should only do that for small payments to phoenix wallets that don't have enough incoming liquidity (and thus need a new channel opened on-the-fly). For other cases it should return unknown_next_peer. It's definitely possible that this case isn't currently handled correctly since only single-trampoline has been implemented.

Reusing an existing error code seems wrong to me here. unknown_next_peer implicitly assumes the sender knows what the next peer should be, which in this case it doesn't. It's not like we're running out of error numbers, so why not add a new unknown_next_trampoline error?


Overall I think two smaller proposals with a single feature being added are likely to make progress much quicker than bundling them up. The larger the surface the more likely someone will object (in this case that's me 😉)

cdecker avatar Jan 22 '21 18:01 cdecker

It seems to me like this proposal should be split into two: one that introduces just the trampoline construction and forwarding logic, and a second one that allows recipients to specify trampolines for incoming payments.

Sounds good, I'll keep the current PR open and up-to-date for people who want to see the whole thing, and will open a smaller PR in spec format for the onion construction.

the last trampoline could be signaled to pay the final destination in non-trampoline mode, thus not requiring the recipient to even understand the trampoline protocol

I don't see any satisfactory way of doing that... All the solutions I tried involve giving that last trampoline invoice information that force you to trust that this trampoline node will not cheat you, whereas it could steal money (payment_secret for amountless invoices for example).

That's why I want to avoid having that scenario in the spec and prefer doing E2E trampoline. For mobile wallets that can't fall back on normal payments when the invoice doesn't have trampoline support, a trusted solution where you give parts of the invoice to a trampoline node may be implemented (that's what we do in Phoenix), but I don't think it should be in the spec because it's unsatisfactory.

If you find a satisfying way of achieving that I'm interested, but I don't think it's possible (it will sacrifice privacy and in some cases funds safety).

upgrading the route-hints to be TLVs with all the mentioned improvements is orthogonal to trampolines

That's a reasonable objection, I'll make a separate PR.

Is this not orthogonal to the original trampoline proposal? I don't see why we need to mix it in here, especially since with route blinding there is already a competing proposal that achieves pretty much the same.

I mentioned this simply as an example to show that it allows a larger design space than we previously had. And because circuit-based approaches have had more research put into them than route blinding, I would feel better having the possibility of studying multiple approaches in case route blinding turns out to be insufficient or not secure enough...

It's not like we're running out of error numbers, so why not add a new unknown_next_trampoline error?

Sure, we can do that, it's more explicit that way.

t-bast avatar Jan 25 '21 09:01 t-bast

It's not like we're running out of error numbers, so why not add a new unknown_next_trampoline error?

Like I said above, it would be useful to have distinct error numbers for unknown_next_trampoline and next_trampoline_unreachable (in case the trampoline fails to find a path, or fails to forward the payment).

In the first case, Trampoline A does not know about Trampoline B, and the sender should not ask TA to forward to TB again. In the second case, the sender will know that TA failed to forward a certain amount to TB, but it may be able to forward a different amount right now, or the same amount later.

Perhaps even more granularity is desirable. If TA fails to find a path to TB, it might return the "capacity" it believes it can send to TB in the error message.
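To make the distinction concrete, here is a sketch of how a sender could react differently to the two errors. The error names are the ones suggested in this thread and were not final at this point; the function name, the `excluded_pairs` set, and the returned action strings are purely illustrative.

```python
from typing import Set, Tuple


def handle_trampoline_failure(
    failure: str,
    from_node: str,
    to_node: str,
    excluded_pairs: Set[Tuple[str, str]],
) -> str:
    """Decide what the sender does after a failure between two trampolines."""
    if failure == "unknown_next_trampoline":
        # from_node does not know to_node: never ask for this pair again.
        excluded_pairs.add((from_node, to_node))
        return "choose_another_trampoline"
    if failure == "next_trampoline_unreachable":
        # Possibly transient: a different amount now, or the same amount later, may work.
        return "retry_later_or_with_other_amount"
    return "give_up"
```

Only the first error changes the sender's long-lived view of the trampoline pair; the second leaves it untouched so that later attempts remain possible.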

ecdsa avatar Jan 25 '21 11:01 ecdsa

I integrated all these requirements in the trampoline onion PR, let's discuss it there: https://github.com/lightningnetwork/lightning-rfc/pull/836.

t-bast avatar Jan 25 '21 13:01 t-bast

Very cool proposal! Still grokking, but had a qq. From offline discussions, it sounds like trampoline will play a key role in async payments (the sender case). Is the plan to adapt this proposal to add some way for the sender to indicate to the trampoline node that they should retry the payment on their behalf? Maybe it already works as-is?

valentinewallace avatar Sep 29 '22 18:09 valentinewallace

Is the plan to adapt this proposal to add some way for the sender to indicate to the trampoline node that they should retry the payment on their behalf? Maybe it already works as-is?

Great timing, we actually just merged a PR to eclair to experiment with that mechanism: https://github.com/ACINQ/eclair/pull/2435 We are using an experimental tlv for now that the sender puts in the trampoline onion for the first trampoline node. This tlv is currently empty (it's just a signal), but we may enrich it with data if we find useful things to stuff in there.

t-bast avatar Sep 30 '22 12:09 t-bast