eclair icon indicating copy to clipboard operation
eclair copied to clipboard

Improve router performance

Open rorp opened this issue 2 years ago • 2 comments

This PR parallelizes route calculations. It creates a bunch of worker actors that perform route calculation in parallel. As a side effect it prevents the Router's mailbox from clogging with route requests, so it can process network updates on time.

See https://github.com/ACINQ/eclair/discussions/2644 for the context.

rorp avatar May 08 '23 16:05 rorp

Codecov Report

Attention: 5 lines in your changes are missing coverage. Please review.

Comparison is base (c37df26) 85.85% compared to head (86ffd1a) 85.92%. Report is 1 commits behind head on master.

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2651      +/-   ##
==========================================
+ Coverage   85.85%   85.92%   +0.07%     
==========================================
  Files         216      216              
  Lines       18201    18221      +20     
  Branches      749      789      +40     
==========================================
+ Hits        15626    15656      +30     
+ Misses       2575     2565      -10     
Files Coverage Δ
...re/src/main/scala/fr/acinq/eclair/NodeParams.scala 93.41% <100.00%> (+0.02%) :arrow_up:
...src/main/scala/fr/acinq/eclair/crypto/Sphinx.scala 100.00% <100.00%> (ø)
.../scala/fr/acinq/eclair/payment/PaymentPacket.scala 90.90% <100.00%> (+0.08%) :arrow_up:
...scala/fr/acinq/eclair/payment/send/Recipient.scala 98.52% <100.00%> (ø)
...cinq/eclair/remote/EclairInternalsSerializer.scala 97.82% <100.00%> (ø)
...a/fr/acinq/eclair/wire/protocol/OnionRouting.scala 80.00% <100.00%> (+13.33%) :arrow_up:
...a/fr/acinq/eclair/wire/protocol/PaymentOnion.scala 99.27% <100.00%> (-0.02%) :arrow_down:
...src/main/scala/fr/acinq/eclair/router/Router.scala 94.42% <95.34%> (+0.02%) :arrow_up:
...cala/fr/acinq/eclair/router/RouteCalculation.scala 94.50% <70.00%> (-0.06%) :arrow_down:

... and 8 files with indirect coverage changes

codecov-commenter avatar May 09 '23 17:05 codecov-commenter

I think this is a useful change, but I dislike the volatile wrapper.

That’s a simple way to avoid Memory Consistency Errors in JVM. And it does the job, since all updates are made in one thread.

IMHO the real overkill is that Router unnecessarily implements FSM, whereas it’s not a finite state machine at all, there’s only one possible state NORMAL, so there are no state transitions, naturally.

My goal was to make the least invasive change, so I introduced stayUsing() method, which is in fact dangerous, because someone could unknowingly use stay() using ??? instead, and screw up the workers state.

I’d prefer to drop FSM support for Router altogether, and to make it a plain vanilla actor, drop any kind of stay() method calls, and stick to persistent data structures. In this case I would do something like this:

class Router(val nodeParams: NodeParams, watcher: typed.ActorRef[ZmqWatcher.Command], initialized: Option[Promise[Done]] = None) extends Actor {

  private val data = VolatileRouterData()
 
  def receive =
    …

    case PeerRoutingMessage(peerConnection, remoteNodeId, c: ChannelAnnouncement) =>
      data.update(
 = Validation.handleChannelAnnouncement(data, watcher, RemoteGossip(peerConnection, remoteNodeId), c))

    …

    case r: RouteRequest =>
      worker ! Worker.FindRoutes(data, r, sender())

    …
  }
}

It’s not really Haskell style, but easy to understand, easy to maintain, and there’s absolutely no Java concurrent programming fun.

We should also probably motivate this change with benchmarks, like any non-trivial change made solely for performance gains.

You can find some benchmarks here https://github.com/ACINQ/eclair/discussions/2644

rorp avatar Jan 21 '24 19:01 rorp