lnd icon indicating copy to clipboard operation
lnd copied to clipboard

Only log "insufficient bandwidth" if this applies to all channels to that peer

Open C-Otto opened this issue 3 years ago • 9 comments

Background

If my node is asked to forward N sats to peer X via channel C, lnd is free to pick any other channel it might have to peer X. As far as I understand the code (switch.go:handlePacketForward), all channels to X (not just C) are checked for eligability, and a random eligable channel is then used for the actual forward.

As part of this eligability check (implemented in link.go:CheckHtlcForward -> link.go:canSendHtlc) the following warning message may be logged:

"insufficient bandwidth to route htlc: %v is larger than %v"

This warning is logged even if only one out of many channels has insufficient bandwidth. As such, even for successful forwards or forwards that failed further "down the line", I get a warning. Note that the warning might appear even if no error/failure is returned to the previous hop!

As a node operator I'd like to concentrate on "real" issues, i.e. peers where I'd need to add liquidity in order to fix routing errors ("temporary channel failure") communicated back to the sender.

Your environment

  • lnd 0.14.0-beta-rc3
  • Linux server 5.10.0-8-amd64 #​1 SMP Debian 5.10.46-4 (2021-08-03) x86_64 GNU/Linux
  • bitcoind v22

Steps to reproduce

  1. Have many channels to peer X, only one with lots of local balance, the others rather empty
  2. Receive forwarding request towards X for amount that can only be satisfied by one of the channels

Expected behaviour

"insufficient bandwidth to route" warning is NOT logged.

Actual behaviour

Warning is logged for each but one of the channels.

Related to #5250

C-Otto avatar Nov 13 '21 22:11 C-Otto

@C-Otto I'm assuming you'd want all types of temporary channel failures to be logged in this way? Eg. "outgoing htlc is too large" as well?

Also I'm assuming you'd still want the same amount of detail in the log as there currently is.

greywolve avatar Nov 17 '21 11:11 greywolve

Yeah, that'd make sense. As a node operator I mainly care about the issues I see, though, and I don't think I've ever seen a "htlc too large". That might be node-specific, though. And, yes, details are always nice! :) I need to know when it failed, for which channel/peer, and how the balances looked like at the time of the failure (so that I can see how much I might need to add in outbound liquidity).

C-Otto avatar Nov 17 '21 11:11 C-Otto

I am interested in picking this up as a first issue to contribute with. From my initial read of the code mentioned, it seems like the best way to handle this is to lower the log level on these messages to Info or Debug. Please let me know if there are other changes that you're looking for. All of the same errors would still be collected, just logged at a lower level.

SachinMeier avatar Jul 16 '22 08:07 SachinMeier

I'd still like to see the log message, just limited to the case that the amount isn't available in any channel to the peer.

C-Otto avatar Jul 16 '22 08:07 C-Otto

This makes sense. If the payment fails, the messages will be exposed as errors, but in the case that it does not, I'm going to drop these messages down to Info.

SachinMeier avatar Jul 16 '22 20:07 SachinMeier

Great. Please don't use the error log level, though. It's not an error, it's rather common and expected (although something users should try to avoid).

FYI: Here's a patch I'm using myself, maybe it helps:

Author: Carsten Otto <[email protected]>
Date:   Sun Dec 5 11:55:53 2021 +0100

    log "insufficient bandwidth" only if this is the failure reason for all channels to that peer

diff --git a/htlcswitch/link.go b/htlcswitch/link.go
index 626cfb67f..f7b87a239 100644
--- a/htlcswitch/link.go
+++ b/htlcswitch/link.go
@@ -2551,8 +2551,6 @@ func (l *channelLink) canSendHtlc(policy ForwardingPolicy,
 
        // Check to see if there is enough balance in this channel.
        if amt > l.Bandwidth() {
-               l.log.Warnf("insufficient bandwidth to route htlc: %v is "+
-                       "larger than %v", amt, l.Bandwidth())
                failure := l.createFailureWithUpdate(
                        func(upd *lnwire.ChannelUpdate) lnwire.FailureMessage {
                                return lnwire.NewTemporaryChannelFailure(upd)
diff --git a/htlcswitch/switch.go b/htlcswitch/switch.go
index b6e3e2537..e2c34b9ea 100644
--- a/htlcswitch/switch.go
+++ b/htlcswitch/switch.go
@@ -1132,6 +1132,20 @@ func (s *Switch) handlePacketForward(packet *htlcPacket) error {
                                htlc.PaymentHash[:], packet.outgoingChanID,
                                linkErr)
 
+                       if linkErr.FailureDetail == OutgoingFailureInsufficientBalance {
+                               var linkWithMaxBandwidth = interfaceLinks[0]
+                               var maxBandwidth = linkWithMaxBandwidth.Bandwidth()
+                               for _, link := range interfaceLinks {
+                                       if link.Bandwidth() > maxBandwidth {
+                                               linkWithMaxBandwidth = link
+                                               maxBandwidth = link.Bandwidth()
+                                       }
+                               }
+                               var logPrefix = fmt.Sprintf("ChannelLink(%v)", linkWithMaxBandwidth.ChannelPoint())
+                               log.Warnf("%v: insufficient bandwidth to route htlc: %v is "+
+                                       "larger than %v (for all channels)", logPrefix, packet.amount, maxBandwidth)
+                       }
+
                        return s.failAddPacket(packet, linkErr)
                }

C-Otto avatar Jul 16 '22 20:07 C-Otto

yea, i'm going to leave it as a Warning. And thank you!

SachinMeier avatar Jul 16 '22 20:07 SachinMeier

I'm also going to add logic for properly logging the other error types emitted at this same stage. It seems short-sighted to add only the maxBandwidth one.

SachinMeier avatar Jul 18 '22 17:07 SachinMeier

@C-Otto I opened a PR to address this. It is very similar to your patch. However, upon thinking through some of the logic, I realized that a channel can be ineligible for an HTLC for many reasons. Thus, it's not always accurate to find the max bandwidth of all channels. In the case that a channel with the most local balance had an abnormally short CLTV expiry, or abnormal HTLC min/max values, this logging could emit "insufficient balance, x > y" where y is actually greater than x, but the channel with y sats of local balance was ineligible for other reasons.

For similar reasons, I also decided not to add logic to log the other error types similarly. the linkErr that currently decides what message is returned is arbitrary. It's not the linkErr that is most immediately causing the HTLC to fail, but rather the linkErr of the link that the sender chose when creating the route (IIUC),

SachinMeier avatar Jul 22 '22 07:07 SachinMeier