think about when to send routes without curves

Open michielbdejong opened this issue 8 years ago • 43 comments

now that we have remote quoting, it can make sense for connectors to sometimes send out routes without curves. the current network is probably still small enough not to need this, but it may be something to think about longer-term

michielbdejong avatar May 10 '17 17:05 michielbdejong

probably, actually, currently all routes are broadcast with curves, and these curves are always used to determine which route to use, but never to determine what quote to give. So that's a slightly weird situation, I think. Each ilp-kit has local copies of all the curves, but doesn't use them for quoting.

michielbdejong avatar May 10 '17 21:05 michielbdejong

Indeed, it is a weird (and I hope very temporary) situation. I'm not sure what small steps we can take to greatly improve things; I think a qualitative jump in routing features/quality (as opposed to bug-fixing and optimization) will require a pretty major redesign.

momerath42 avatar May 10 '17 21:05 momerath42

what if we just disable remote quoting?

michielbdejong avatar May 10 '17 22:05 michielbdejong

/me ducks ;)

michielbdejong avatar May 10 '17 22:05 michielbdejong

Each ilp-kit has local copies of all the curves, but doesn't use them for quoting.

Is this true? I thought it does use the local liquidity curve for quoting when it has it and only does a remote quote when it does not have the curve.

Even if we are not currently using remote quoting because everyone is broadcasting their curves to one another, I don't see any reason to take this functionality out. We know it will be useful if someone wants a more complicated fee structure that they don't want to broadcast to the entire network. If this isn't causing problems now I don't think we should spend time discussing it.

emschwartz avatar May 11 '17 07:05 emschwartz

Is this true? I thought it does use the local liquidity curve for quoting

let me have a look at the code, that's the only way to know for sure, I guess :)

michielbdejong avatar May 11 '17 09:05 michielbdejong

RouteBuilder#getQuote uses Ledgers#quote uses ilp-core quote, which is remote.

After that, slippage is added and the curve from the remote quote response is added to the connector's "own" quote response, but it doesn't seem to look at its locally cached curves at all in the getQuote method.

It was changed in this commit, 9 months ago, you can see the 'find next hop' code was removed there. I added a question about the title of that PR, as it seems misleading to me (but maybe I misunderstood what the code does!).

Remote quoting can be used in 3 situations:

  1. source-side, where a "default route" is configured. This just makes the connector act as a sender for remote routes, so that's an easy case which doesn't affect route quality or quote quality, and although it's not currently used by any live ilp-kit nodes, you could easily imagine someone would want to try this out.
  2. destination-side, broadcast routes without curves. This is currently neither supported for broadcast-sending, nor for broadcast-receiving. Also, we probably don't need this yet since each ilp-kit only has one local FiveBells ledger, and not many complex subledgers. If we do decide to start using this (hence the title of this issue), then we should think about the impact of that - maybe we need some other measure to compare which one of two curve-less routes is really cheapest.
  3. to cover up errors in the curves that were cached from route broadcasts. If the locally cached curve is wrong, then doing a remote quote will cover up the errors in it. IMHO if this is really what we are currently doing (whether by design or due to a misconfiguration), then we should stop doing it.

michielbdejong avatar May 11 '17 09:05 michielbdejong

I think I understand why it works this way. It assumes that you're sending to an address on the periphery (illustrated in the diagram below). This model assumes that all nodes in the core have all the routes to everyone else in the core. This means you need to get a "tail quote" for the leg from the far end of the core to the destination account. It joins the tail curve with the local one and it looks like it uses the first hop from the local table to figure out the source amount for the quote.

[Image: diagram of the network core and periphery]

Right now the ILP kit network is all core, no periphery, because all nodes broadcast all routes to one another and save everyone else's routes. Not sure though what this means about whether in practice the quotes are determined by the local tables or the remote quote.
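
To make the "join the tail curve with the local one" step concrete, here is a minimal sketch. It assumes a liquidity curve is a sorted list of (source amount, destination amount) points with linear interpolation in between, and that the curve is flat beyond its last point. The function names `amount_at` and `quote_via_core_and_tail` are hypothetical and are not the connector's actual API.

```python
from bisect import bisect_right


def amount_at(curve, x):
    """Evaluate a piecewise-linear curve [(x, y), ...] (sorted by x) at x."""
    if not curve or x < curve[0][0]:
        return 0.0
    if x >= curve[-1][0]:
        # Assumption: beyond the last point the curve stays flat (liquidity cap).
        return float(curve[-1][1])
    # Index of the first point whose x is strictly greater than the input.
    i = bisect_right([point[0] for point in curve], x)
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def quote_via_core_and_tail(local_curve, tail_curve, source_amount):
    """Feed the output of the locally known curve into the remotely quoted tail curve."""
    return amount_at(tail_curve, amount_at(local_curve, source_amount))


# Hypothetical curves: the core leg keeps 2%, the tail leg keeps 10%.
local_curve = [(0, 0), (100, 98)]
tail_curve = [(0, 0), (200, 180)]
destination_amount = quote_via_core_and_tail(local_curve, tail_curve, 50)
```

With these made-up numbers, 50 units at the source become 49 at the far end of the core and roughly 44.1 at the destination.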

emschwartz avatar May 11 '17 10:05 emschwartz

It doesn't sound to me like remote quoting is the problem. It sounds like it may be using remote quoting when it shouldn't or doesn't need to. The ability to do remote quoting should not be disabled but we should make sure it is being used in the correct circumstances.

emschwartz avatar May 11 '17 10:05 emschwartz

sorry, by 'disable', I did not mean 'remove', I just meant 'use only in the correct circumstances' (so only 1. and 2. and not 3.)

michielbdejong avatar May 11 '17 10:05 michielbdejong

It doesn't sound to me like remote quoting is the problem. It sounds like it may be using remote quoting when it shouldn't or doesn't need to. The ability to do remote quoting should not be disabled but we should make sure it is being used in the correct circumstances.

I think it's a very good idea for now to have it always use remote quoting. By keeping routing and quoting separate, the system becomes a good bit simpler and more robust. If they are separate, you can't get a quoting failure because we changed something about routing and vice versa.

Right now, we often get an error that there is no path found at all. So conceptually we should at least aim to design a system that will find some valid, reasonably short path if one exists.

From that baseline, we can add optimizations:

  • The first optimization is to think about how we can prefer cheaper paths over more expensive ones (in circumstances where doing so is safe.)

    How important this optimization is depends largely on whether there is enough of an incentive to have low fees otherwise. It's plausible that the choices people make while peering and when manually configuring routing tables could be good enough to create competition.

    Due to the difficulty we've had getting the current routing protocol working reliably, we should consider a simplified protocol that does not broadcast liquidity curves at all, but rather routes purely on a shortest-path basis. We would still want to ensure that connections are removed from the graph if they have run dry, but this can be a local connector function: If liquidity < $10, stop advertising this route.

  • Another optimization is to use information that has already been exchanged via the routing protocol to skip quoting or improve quoting latency.

    I think this is not something we should attempt until the routing protocol is pretty stable and fleshed out. If quoting expects information in a certain format (e.g. a liquidity curve), it puts constraints on our design of routing which in turn makes this harder to iterate on. Note that quoting is supposed to be the more stable of the two, so any dependency of quoting on routing is going to be a PITA to maintain as we iterate routing.

    Finally, it is not clear to me that this optimization is a good idea at all. In IP, there is something quite similar to quoting: Path MTU Discovery. The MTU is a property of a path between two endpoints that the sender might wish to discover, just like an ILP sender would want to discover the liquidity curve of a given path. The IETF could have tried to include MTUs in routing tables, but this would have created the exact nasty dependency I've outlined in the previous paragraph. Interestingly, even when they had a chance with IPv6 to redesign it and move it into routing (as an optimization), they went the opposite direction and made it a fully end-to-end functionality. This isn't the most efficient way of doing it, but the simplest and most general.

    It's plausible to me that this is simply not worth it as an optimization. Note that caching quotes would have nearly the same performance benefit at a fraction of the complexity cost. If my users are making enough payments to a certain destination ledger that it is worth it for me to have the liquidity curve in memory, there will also be enough payments to that destination that most quote requests are cache hits. As a connector, I can even decide to actively keep the cache warm. Doing it that way creates no dependency on routing and no dependency on IPR etc. (like tail quotes would have) and my intuition is it gives you 99.9% of the performance benefit.
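
The "if liquidity < $10, stop advertising this route" rule from the first bullet could be a purely local filter along these lines. This is only a sketch: the route dictionaries, the field names, and `routes_to_advertise` are hypothetical, and the 10-unit threshold is simply the number used in the text.

```python
MIN_ADVERTISED_LIQUIDITY = 10.0  # hypothetical threshold, taken from the "$10" example


def routes_to_advertise(routes, min_liquidity=MIN_ADVERTISED_LIQUIDITY):
    """Return curve-less route advertisements, dropping routes that have run dry.

    Each input route is a dict with 'prefix', 'hops' and 'liquidity' keys.
    The advertisement carries only the prefix and hop count, no liquidity curve.
    """
    return [
        {"prefix": route["prefix"], "hops": route["hops"]}
        for route in routes
        if route["liquidity"] >= min_liquidity
    ]


local_routes = [
    {"prefix": "us.usd.red.", "hops": 1, "liquidity": 250.0},
    {"prefix": "eu.eur.blue.", "hops": 2, "liquidity": 4.5},  # run dry, not advertised
]
advertised = routes_to_advertise(local_routes)
```

Peers then route on hop count alone, and liquidity stays private to the connector that owns it.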

In summary:

  • Quoting should NOT use liquidity curves from the routing tables for the foreseeable future because we don't want to create a dependency between quoting and routing.
  • We should consider a simplified protocol that does not take liquidity curves into account at all. (Credit to @momerath42 and @michielbdejong who I believe suggested this before I did.)

justmoon avatar May 11 '17 12:05 justmoon

I like the separation in IP between what are sometimes called the routing and routed protocols. i.e. Management protocols between nodes as opposed to the protocols responsible for actually delivering traffic (or in the case of ILP, value). Exchanging of route data is a routing protocol whereas quoting is not really, it is something new (like doing a traceroute performed before sending a packet) that probably qualifies more as a routed protocol.

Perhaps it's worth differentiating between the protocols and the functions of the nodes/gateways. Because quoting and routing are both functions performed by a node but those functions could be achieved in a number of ways.

When a node receives an incoming transfer it must route it, which it does based on its routing tables. How the data got into the routing table is a separate concern. As @michielbdejong and I discussed the other day, IP has numerous routing protocols and I expect we'll end up having more than 1 too. Some may have optimizations like attaching liquidity curves to the routing data but those should not be seen as a core requirement of a routing protocol. Perhaps we should be wary of labelling what we come up with today as "The Routing Protocol"?

Likewise, when a node receives an incoming quote request it must calculate a response either by first routing the request down one or more routes to other connectors and then adding a spread to the response amount or by using some internal heuristics (like cached liquidity curves). Again, the protocol for exchanging the data that might be used as part of this function is a new protocol(s), or an extension to another connector-to-connector (routing) protocol.

So, with regards to this, and looking at things purely from the perspective of defining the protocol specifications:

Quoting should NOT use liquidity curves from the routing tables for the foreseeable future because we don't want to create a dependency between quoting and routing.

I don't see using liquidity curves as a protocol feature but more of an implementation optimization. The core protocol requires that nodes can request a quote from their peers and we shouldn't make it more complex than that.

The protocol only says that a node that gets a quote request must return a response. It doesn't say how it gets that response so if nodes find that they can effectively respond to quote requests based on cached liquidity data then that will drive the development of a mechanism to exchange that data.

adrianhopebailie avatar May 11 '17 13:05 adrianhopebailie

Curves in route broadcasts is what we currently have, and it's a direct result of the "complexity is in the connectors" design choice. We should only switch to curve-less route broadcasts if we think we're on a dead end with the current design.

In the current network topology of the open Interledger, it's very easy to let each node hold all the routing tables including curves. So I think we should stick with curves in route broadcasts.

Advantages of keeping curves in route broadcasts:

  • less work, because it's the code we have
  • if we keep refactoring all the time, bugs will not go away, they will be replaced by new ones
  • curve-less route broadcasts may be more efficient for announcing exotic ledgers in bulk, but remote quoting adds a cost at the time that the user has already clicked the button and is waiting; better keep the network in an always-ready-for-action state
  • algorithmically, building all multi-hop routes for a network is less work if you build them all at the same time; when new information becomes available in one node, it is wise to forward this information immediately, because it will save other nodes work if they are better informed, sooner.

michielbdejong avatar May 11 '17 14:05 michielbdejong

We should only switch to curve-less route broadcasts if we think we're on a dead end with the current design.

In terms of the ILP Kit design that's fine, your points are valid.

In terms of defining and documenting routing protocols I'd favor documenting a protocol that excludes them and is very simple and efficient.

adrianhopebailie avatar May 11 '17 15:05 adrianhopebailie

I've been thinking of the curves-in-route-broadcast as a vestigial organ, which would be painful to remove, out of proportion to the gain of removal. Regardless of whether we want to remove it from the implementation in the near future, I agree with @adrianhopebailie that the specification shouldn't include it, but leave the question of how a connector maintains and makes use of its routing table to implementations and/or later specifications.

If we do want to remove them from the implementation, for clarity and (eventual) robustness gains, the first thing we should do is replace the use of them in choosing who to get a quote from. That component could range from dead-simple-and-not-really-any-better-than-what-we-have to ensemble-of-ML-algos-produced-over-man-years. It's not clear to me where the knee of that curve is.

momerath42 avatar May 11 '17 16:05 momerath42

Curves in route broadcasts is what we currently have, and it's a direct result of the "complexity is in the connectors" design choice. We should only switch to curve-less route broadcasts if we think we're on a dead end with the current design.

Yes, that's my understanding as well. Ideally, we would fix the routing protocol with minimal changes. Right now I'd give it about a 60% probability that we can make that work. We know the current least-cost routing will not work, because a single bad node could blackhole the entire network's traffic by publishing a single unrealistically good rate. At a minimum we have to change it to shortest-least-cost routing, i.e. out of the set of shortest (fewest AS hops) paths, which one has the least cost? But if that works out, we would still have liquidity curves for non-aggregated routes.
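
As a rough illustration of the difference between plain least-cost and shortest-least-cost selection, assume each candidate route is a (next hop, hop count, advertised cost) tuple. The tuple layout, the peer names, and both function names are hypothetical.

```python
def least_cost(candidates):
    """Pick purely on advertised cost (the behaviour that can be blackholed)."""
    return min(candidates, key=lambda c: c[2]) if candidates else None


def shortest_least_cost(candidates):
    """Pick among the fewest-hop routes, breaking ties by advertised cost."""
    return min(candidates, key=lambda c: (c[1], c[2])) if candidates else None


candidates = [
    ("honest-peer", 2, 1.02),
    ("other-peer", 2, 1.05),
    ("bad-peer", 5, 0.01),  # unrealistically good rate over a long path
]
```

Here `least_cost` sends everything to `bad-peer`, while `shortest_least_cost` ignores it because its path is longer. A bad node that advertises an equally short path could still win the tie-break, so this narrows the problem rather than eliminating it.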

But even if the routing protocol can provide liquidity curves locally, I'd still not want to use them for quoting. For a couple of reasons:

  • It puts additional constraints on the routing protocol - for example:
    • For routing, it might be fine if the liquidity curves are updated once every couple of hours. Sure, there may be a cheaper connector, but it may be more important to dampen route fluctuations. But for quoting, liquidity curves have to be up-to-date.
    • For routing, we don't have to send an updated liquidity curve at all if we don't think that it will affect the paths taken. For quoting, we do.
    • For routing, curves don't need to be nearly as accurate as for quoting, so we might be able to get away with far more compressed curves (or even simple ratios) if we don't use the curves for quoting also.
    • For routing, we don't necessarily need to keep the curves in memory. If we don't expect to receive a competing curve to compare to the one we currently use (e.g. because there is only one path for a given destination), then we may be able to discard them as soon as we've calculated and broadcast the effects on our routing table. (Not saying we should do this, just giving examples to illustrate that adding quoting as a use case impacts many decisions regarding the routing protocol.)
  • What is a ledger for routing purposes and what is a ledger for quoting purposes may not coincide. For instance, we were talking about using ledger prefixes to distinguish transaction types with different fee policies. Suppose we have a bank with three transaction types and three different fee levels. During quoting, it will respond with different liquidity curves for each fee type and set appliesToPrefix accordingly. However, for routing, these distinctions are irrelevant - all payments go through the bank's connector, so the bank would likely just broadcast its entire prefix with a liquidity curve representing its currency, which does not contain those fees.
  • If we make providing curves to the quoting system a use case for our routing protocol, we have to consider quoting in all discussions about the routing protocol. Designing the routing protocol is difficult enough. If we can avoid bringing in the cross-cutting concern of using the same curves for quoting, we should do so.
  • If there is a connection between the two, every change to the routing risks introducing a new bug by changing the data that the routing protocol supplies to the quoting code in some subtle way.
  • This optimization will become worthless once we start aggregating routes.
  • Caching quotes (possibly with active prefetching) is available as a more robust alternative.
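
A minimal sketch of what caching quotes with active prefetching might look like, independent of the routing tables. The class `QuoteCache`, the `fetch_quote` callback (standing in for a remote quote request), and the 60-second lifetime are all assumptions made for illustration.

```python
import time


class QuoteCache:
    """Cache remote quotes per destination prefix for a fixed time-to-live."""

    def __init__(self, fetch_quote, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch_quote = fetch_quote  # hypothetical remote-quote function
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # destination -> (quote, time fetched)

    def get(self, destination):
        """Return a cached quote if it is still fresh, else fetch a new one."""
        entry = self._entries.get(destination)
        if entry is not None and self._clock() - entry[1] < self._ttl:
            return entry[0]
        return self.refresh(destination)

    def refresh(self, destination):
        """Fetch a quote remotely and store it with the current timestamp."""
        quote = self._fetch_quote(destination)
        self._entries[destination] = (quote, self._clock())
        return quote

    def keep_warm(self, destinations):
        """Active prefetching: refresh popular destinations before users ask."""
        for destination in destinations:
            self.refresh(destination)
```

A connector could call `keep_warm` on a timer for its busiest destinations, so that most user-facing quote requests are cache hits.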

justmoon avatar May 11 '17 19:05 justmoon

Yes, that's my understanding as well. Ideally, we would fix the routing protocol with minimal changes. Right now I'd give it about a 60% probability that we can make that work.

I'm unclear what you mean about this. From my perspective the routing part of things is already quite solid, if the goal is for every node to always have a route to any of the destinations that should be available in reality (I've tested every nonisomorphic topology of 7 or fewer connectors, and we're now 39% of the way through the 8 node graphs). My uncertainty about the achievement of this goal is in the realm of failure-modes caused by network connections coming up and going down with problematic timing.

The original goal of the implementation I hacked up (in a minimally-invasive way, hence the vestigial curves in route-broadcast messages) went beyond this, into multi-criteria pathfinding. And I think, ultimately, routes will be chosen by such a method (and support for it might be required as far down as route-broadcast). But as per our plan when I started hacking on the js implementation, routing now resembles distance-vector protocols, and quoting is (or is supposed to be) remote-only.

So, what's missing in the short term, from my perspective, is a better, simple route-selection mechanism, that doesn't rely on the broadcast curves (which we can then remove), and a quote-caching system (which might well be a requisite of a route-selection method).

momerath42 avatar May 11 '17 19:05 momerath42

The current situation is based on honesty: each connector broadcasts a curve, and a sender has to trust each connector along the path not to steal a little bit during the remote-quoting phase.

That needs to change, because we want the network to at the same time be open to unvetted participants, and to handle real money.

If routing and quoting are decoupled, then you need to trust each connector along the route; it will become very easy for connectors to always charge high connector fees.

Thus far, I think we agree, right?

Now, the question is, how do we improve the current situation. :)

My suggestion would be: use the liquidity curves from the routing broadcasts for quoting, and add peer metrics. That way, a connector that announces cheap routes will have to route payments at that cheap price, and will either lose money or have a high failure rate, meaning it will be expelled from the network.

Other metrics (apart from price and payment success rate stats from the peer metrics) which I think could be relevant for route selection would be latency, and maybe throughput (although that's probably hard to measure other than through success rate under high traffic).

I do agree that the current two-objective routing tables are not the simplest solution. As a simplification, to get a single-objective cost function, we could route based on "required source amount for sending 1 USD of destination amount within 10 seconds, with 95% success rate".

Another option would be to always remote-quote all possible paths, through the whole network, and not just over one path, but I think we agree that that would lead to a lot of unwanted message sending storms.

We can use # hops and split-horizon as heuristics to avoid loops, but IMHO if you use it as the cost function, connector fees in the network will no longer be subject to competition, and thus become very high.

michielbdejong avatar May 11 '17 21:05 michielbdejong

Reminder: the broadcast curves aren't updated, so you shouldn't trust them as a metric for route choice over long time periods anyway, even if everyone was being fair, but rates were changing.

I think we can do pretty well with efficiency and bad-actor tolerance, without having curves be part of routing, and without any algorithmic improvement over path-vector (or even distance-vector). This is coming from someone who'd love to spend all his time reading papers and trying to cobble the best ideas together into greenfield prototypes.

Connectors know when they have multiple next-hop options for a given destination. They don't currently know about downstream topology[1]. At startup, they receive all the quotes they might want, in a push fashion. Whether those quotes are updated on-demand (low-transaction volume and/or high volatility) or by a subscription mechanism, obviously there's a practical limit to how many {next-hop, destination} pairs we want to be keeping an up-to-date picture of (or waiting for a response from for a given payment in the low-volume case). To me, that speaks to the separation of liquidity-curve info from establishment of routing tables.

Without really digging in, I think it's best seen in reinforcement learning terms: trying to balance explore and exploit. In an established network, I would expect the accepted 'fair' exploitation of the system's properties to cross well into what would be disruptive behavior on the current network. More importantly, I don't see any 'ledge' in the slope between a network of trusted peers, and that eventual future. I don't know if there are any purely technical solutions, and I'm doubtful that even given natural incentives, and an acceptable level of loss for any given relationship (did I just define trust lines?), that there is a simple metric, which if it were widely adopted, wouldn't be subject to disruption-for-profit, followed by scrambling to adapt. So, I start thinking about ensemble systems involving PID loops and anomaly detection... and I never write up a simpler design.

  1. I'd argue for some transparency into that, along the lines of BGP, but for network stability reasons: it makes loop detection trivial.

momerath42 avatar May 11 '17 22:05 momerath42

In this network:

           1- C -1
         /         \
A --3-- B  ---3---  D

route ABD has cost 6 and route ABCD has cost 5. But you can only discover that if the cost function you want to optimize is used as the cost function for the routing algorithm. If you use path length as the routing criterion, ABD will look better than ABCD.

I think finding the best route in this case is important.
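
The example network above can be checked with a small Dijkstra search that takes the cost function as a parameter. The graph literal mirrors the diagram; the function name `best_path` is just for illustration.

```python
import heapq

# Edge weights from the diagram: A-B costs 3, B-D costs 3, B-C and C-D cost 1.
GRAPH = {
    "A": {"B": 3},
    "B": {"A": 3, "C": 1, "D": 3},
    "C": {"B": 1, "D": 1},
    "D": {"B": 3, "C": 1},
}


def best_path(graph, source, target, edge_cost):
    """Dijkstra's algorithm; edge_cost maps an edge weight to the routing metric."""
    queue = [(0, [source])]
    settled = set()
    while queue:
        cost, path = heapq.heappop(queue)
        node = path[-1]
        if node == target:
            return cost, path
        if node in settled:
            continue
        settled.add(node)
        for neighbor, weight in graph[node].items():
            if neighbor not in settled:
                heapq.heappush(queue, (cost + edge_cost(weight), path + [neighbor]))
    return None


by_fee = best_path(GRAPH, "A", "D", edge_cost=lambda weight: weight)
by_hops = best_path(GRAPH, "A", "D", edge_cost=lambda weight: 1)
```

Routing on fees yields A, B, C, D at cost 5, while routing on hop count yields A, B, D, which costs 6 in fees.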

michielbdejong avatar May 11 '17 22:05 michielbdejong

I should have qualified 'distance-vector' - in the classic algo, you throw away routes of greater distance/cost. I'm suggesting that connectors maintain knowledge of the reachability of all destinations for all peers (which they do now). Which of those peers to choose for any given destination+amount, at any given time, would be solved by a separate system, which takes into account (possibly out of date at times) quoting data, as well as a reliability metric, and possibly other considerations.
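
One very simple form such a separate selection system could take, purely as a sketch: keep reachability per peer, and rank the peers that can reach the destination by their last known quote divided by their observed success rate. The scoring formula, the field names, and `choose_next_hop` are assumptions for illustration, not a proposal for the actual heuristic.

```python
def choose_next_hop(peers, destination):
    """Pick the peer with the lowest quote-per-successful-payment estimate.

    Each peer is a dict with:
      'name'         - peer identifier
      'reachable'    - set of destination prefixes the peer has advertised
      'quotes'       - last known source amount per destination (may be stale)
      'success_rate' - observed fraction of payments that completed (0..1)
    """
    best = None
    for peer in peers:
        if destination not in peer["reachable"]:
            continue
        quote = peer["quotes"].get(destination)
        if quote is None or peer["success_rate"] <= 0:
            continue
        score = quote / peer["success_rate"]  # hypothetical expected-cost metric
        if best is None or score < best[0]:
            best = (score, peer["name"])
    return None if best is None else best[1]


peers = [
    {"name": "cheap-but-flaky", "reachable": {"eu.eur."}, "quotes": {"eu.eur.": 1.01}, "success_rate": 0.5},
    {"name": "steady", "reachable": {"eu.eur."}, "quotes": {"eu.eur.": 1.05}, "success_rate": 0.99},
    {"name": "elsewhere", "reachable": {"us.usd."}, "quotes": {"us.usd.": 1.00}, "success_rate": 1.0},
]
```

In this made-up data the slightly more expensive but reliable peer wins for `eu.eur.`, because the cheaper quote is discounted by its 50% failure rate.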

momerath42 avatar May 12 '17 03:05 momerath42

Sorry, was trying to clarify further, decided I wasn't making sense, and clicked close, thinking it was cancel. Hopefully I'll be able to write up something more complete and lucid in the morning.

momerath42 avatar May 12 '17 05:05 momerath42

So what, concretely, would be the cost function of the distance-vector routing step? Just reachability of a destination? (cost of a route is 0 if reachable, infinity if not)?

And what would be the search algorithm for the quoting step? try out all combinations, or some heuristic for exploration/exploitation?

michielbdejong avatar May 12 '17 05:05 michielbdejong

It sounds like there are still unanswered questions about a) our medium to long-term quoting/routing strategy and b) how we should improve the implementation in the short term.

Clarifications (this is how I understand the following terms, I'm worried we may be using different definitions of the terms):

  • Route discovery - how connectors find out about available routes to a given destination.
  • Route selection - how connectors determine what next hop(s) they should use for a given destination prefix. Different connectors may use different algorithms, that take into account different metrics, but we need to agree on what to implement in the reference implementation.
  • Routing protocol - the data format connectors use to send one another information relevant for their route discovery and/or route selection algorithms. There is currently a debate about what information should be included in this.
  • Quoting algorithm - how connectors answer quote requests. Connectors may use information cached locally or remote quotes to other connectors to answer requests. Like the routing algorithm, we need to agree on what to implement in the reference implementation.
  • Quoting protocol - how you ask a connector for the rate or liquidity curve to a specific destination prefix. This is something that must be standardized early and should be very simple.

I would summarize @michielbdejong's perspective as follows:

  • The route selection algorithm must involve the cost, because that is what users of it will care about and we cannot rely even on an honest connector to consistently maintain the cheapest path. Path MTU is not a good parallel because that does not change the route you want to take, only what fragmentation you use.
  • If we switched from "routing" to "quoting" with quote caching, we would end up needing push-based curve broadcasts because stale cache entries or paths that have gone bad or gotten too expensive would cause failed payments. You would refresh your cache when the payments fail but it would result in a bad user experience.
  • If we have quote caching and then re-add a push-based update mechanism to allow connectors to proactively notify peers about liquidity or rate changes, we would effectively be back where we are now but we would have renamed "routing" to "quoting".
  • Right now our implementation has issues but those are the result of bugs that we should fix rather than an indication that the algorithm is fundamentally flawed.
  • The simplest path forward is to fix the current bugs (like https://github.com/interledgerjs/ilp-connector/issues/338) and answer quotes from the cached curves when they are available.

My understanding of @justmoon's and @momerath42's perspective is:

  • Routing and quoting in the ilp-connector are currently broken and the routing algorithm is fundamentally flawed.
  • Connectors should answer all quotes using remote quotes to other connectors. This is a fundamentally simpler starting point and we can add other optimizations later.
  • When network traffic is a problem connectors can cache liquidity curves from quotes and possibly implement active prefetching (using connector-initiated quote requests).
  • If the routing protocol includes liquidity curves and the routing algorithm is based on that only, as opposed to number of hops, we will run into accidental and malicious routing black holes.
  • Even if the routing protocol included liquidity curves, the quoting algorithm still should not use them.

My questions about @justmoon and @momerath42's perspective:

  • How do we know the current routing/quoting algorithm is fundamentally broken rather than the issues we're seeing being a result of bugs in the implementation?
  • How would you deal with significant changes in rates or liquidity that happen before caches are invalidated? Would this need push-based updates?
  • Do you expect connectors to ask for remote quotes from one or multiple connectors?
  • Would we cache liquidity curves in the same data structures we currently call routing tables?
  • Is there a substantive difference between the routing protocol including liquidity curves and the quoting protocol including push-based updates?

emschwartz avatar May 12 '17 12:05 emschwartz

Edited: Route discovery -> route selection below

Great summary @emschwartz!

I would add, and I think @justmoon made this point, that routing data and liquidity data have very different update frequencies. This is further motivation to use different protocols to exchange this information.

It's possible (without curves being exchanged at all) for a node to use a Routing protocol to determine which of its peers give it access to which areas of the ILP address space. This doesn't mean it's able to do full Route selection yet, because as @michielbdejong points out, cost is an important dimension to consider in deciding where to actually send a payment.

But, with a routing table populated with just physical route data, a sender can then issue quote requests to all of the peers it considers worth asking for the cost to deliver a payment. This may become inefficient and there is a chance that cheaper routes are missed because they are filtered out before quoting but trying to push liquidity curves into the route data seems like a premature optimization.

Rather, with this simple model working we can introduce extensions to the quoting protocol that return curves so that a quote requestor is able (if it chooses) to cache data for future quotes.

TL;DR: Routing protocol should be optimized so that processing, aggregation and rebroadcasting route data can be optimized within connectors. Exchanging liquidity curves should be seen as an optimization of the quoting process, possibly introducing a new connector-to-connector protocol for this purpose.

adrianhopebailie avatar May 12 '17 12:05 adrianhopebailie

I object to the characterization of our current Route discovery or Routing protocol as fundamentally broken. The behavior I intended it to have, when I was asked to make it work and scale well enough to do a 30 ledger demo, without it falling over, is in place and working reliably.

I agree that removing cost (and other) information from the route-discovery & routing-protocol layer does punt on the route-selection problem, and that a solution to it at the quoting layer, in some designs, starts looking like a distributed routing algorithm. But, my claim is that we don't have a distributed multiple-metric Route-selection design that would scale. The only designs I see working are heuristic, and whether quotes are push or pull is dependent on the workload of the connector and its peering relationships.

momerath42 avatar May 12 '17 18:05 momerath42

I object to the characterization of our current Route discovery or Routing protocol as fundamentally broken.

Well, the current route selection algorithm (as in, actual current master branch of this repo), from what I understand, can be characterized as 'gratuity-based': it's like choosing a restaurant that includes a gratuity in the bill. You choose the restaurant based on the prices listed on its menu, but the price you end up paying is essentially unrelated.

The current plan, from what I understand, can be characterized as 'cached quotes' route selection: we remove the liquidity curves from the route broadcasts, and we choose a cost function (for instance, number of hops) which is unrelated to monetary cost. Then, routes are not used to decide which path to take; instead, they are only used to decide along which path to quote, and cached quotes essentially become a renamed version of our current routing tables.

IMHO this plan will take a lot of refactor work, and not having push (broadcast) messages will mean information will travel slower and less efficiently through the network.

I don't like this plan. I think, instead, we should revert https://github.com/interledgerjs/ilp-connector/pull/206#issuecomment-300741626 and use the liquidity curves (which are already implemented) both for route selection and for quoting. That way, the 'gratuity' system is removed, and connectors can compete with each other using market forces.

Note: this is under the assumption that our reference connector can be used on open networks, and not only on closed networks where network participants are vetted.

michielbdejong avatar May 13 '17 19:05 michielbdejong

From my perspective, the "current plan" includes a heuristic route selection mechanism, utilizing quotes from multiple peers (but not every peer every time). This does not mean choosing a cost function unrelated to monetary cost; if a peer's quote is expired, but we use it because they've been the cheapest and most reliable, and it turns out more expensive than expected, the heuristic will take that into account.

Keeping liquidity curves up to date for all possible routes is, I believe, requiring too much work of low-volume connectors, and limiting the scale at which high volume connectors can work (else they'll limit the number of peer,destination pairs they have to deal with).

momerath42 avatar May 14 '17 18:05 momerath42

Talking strictly about the open Interledger, let's try to come up with some numbers of how impossible it would be to maintain perfect routing info: say there are 20 nodes, and each node has 5 trustlines. Say each trustline changes three times a day - maybe once due to fixer.io exchange rate changes, and twice because balances changed so much that the liquidity cap had to be adjusted.

That means each node sends out only 3x20x5=300 messages a day, and receives only 3x20x5 messages a day - so each ilp-kit needs to send and receive roughly one message every 5 minutes? That's still a lot less than the current heartbeats every 30 seconds.
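
The back-of-the-envelope numbers can be spelled out as follows. The inputs are exactly the assumptions stated above (20 nodes, 5 trustlines each, 3 changes per trustline per day, one heartbeat every 30 seconds); the variable names are mine.

```python
NODES = 20
TRUSTLINES_PER_NODE = 5
CHANGES_PER_TRUSTLINE_PER_DAY = 3
HEARTBEAT_INTERVAL_SECONDS = 30

MINUTES_PER_DAY = 24 * 60
SECONDS_PER_DAY = MINUTES_PER_DAY * 60

# Curve updates a node would see per day if every change is broadcast network-wide.
updates_per_day = CHANGES_PER_TRUSTLINE_PER_DAY * NODES * TRUSTLINES_PER_NODE

# Average gap between two consecutive updates.
minutes_between_updates = MINUTES_PER_DAY / updates_per_day

# Heartbeats per day at the current interval, for comparison.
heartbeats_per_day = SECONDS_PER_DAY // HEARTBEAT_INTERVAL_SECONDS
```

That gives 300 updates a day, one every 4.8 minutes, against 2880 heartbeats a day at the 30-second interval.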

IMHO, let's just try to get ilp-kit working and do a 2.1 release on Wednesday. Then if new bugs or scalability problems show up, we'll roll with the punches?

michielbdejong avatar May 14 '17 19:05 michielbdejong

Why would there only be 20 nodes? I thought we were talking about something that people adopt. I'm also looking toward a competitive environment, where rates are changing much more frequently.

As far as a 2.1 release, what are you envisioning it including? You make it sound (to me) like we already have what you describe.

momerath42 avatar May 14 '17 19:05 momerath42