lnd icon indicating copy to clipboard operation
lnd copied to clipboard

[bug]: [ERR] NANN: Unable to retrieve chan status for Channel -s since update to v0.20.0 on postgres

Open openoms opened this issue 4 weeks ago • 15 comments

Pre-Submission Checklist

  • [x] I have searched the existing issues and believe this is a new bug.
  • [x] I am not asking a question about how to use lnd, but reporting a bug (otherwise open a discussion).

LND Version

v0.20.0-beta

LND Configuration

db.use-native-sql=true bitcoind.rpcpass=passxxxx db.backend=postgres db.postgres.dsn=postgres://user:[email protected]:5432/lnd db.postgres.timeout=0 db.postgres.maxconnections=50 restlisten=0.0.0.0:8080 rpclisten=0.0.0.0:10009 listen=0.0.0.0:9735 prometheus.listen=0.0.0.0:9092 bitcoin.node=bitcoind bitcoind.rpcuser=rpcuser bitcoind.zmqpubrawblock=tcp://bitcoind:28332 bitcoind.zmqpubrawtx=tcp://bitcoind:28333 tlsextradomain=lnd accept-keysend=1 allow-circular-route=1 stagger-initial-reconnect=1 protocol.wumbo-channels=1 maxchansize=500000000 bitcoin.timelockdelta=60 default-remote-max-htlcs=50 debuglevel=info prometheus.enable=1 gc-canceled-invoices-on-the-fly=true gc-canceled-invoices-on-startup=true tor.active=true tor.v3=true tor.skip-proxy-for-clearnet-targets=true bitcoind.rpchost=bitcoind:8332 bitcoin.basefee=0 bitcoin.feerate=1000 max-commit-fee-rate-anchors=100 bitcoind.zmqpubrawblock=tcp://bitcoind:28332 bitcoind.zmqpubrawtx=tcp://bitcoind:28333 alias=xxxx minchansize=10000000 bitcoin.mainnet=true

Backend Version

Bitcoin Core 28.0

Backend Configuration

debug=mempool debug=rpc shrinkdebugfile=1 server=1 txindex=1 printtoconsole=1 rpcuser=xxx rpcpassword=xxx zmqpubrawtx=tcp://0.0.0.0:28333 zmqpubrawblock=tcp://0.0.0.0:28332 blockfilterindex=1 dbcache=2048 maxmempool=512 rpcworkqueue=128 mainnet=1

OS/Distribution

lightninglabs lnd container in K8s

Bug Details & Steps to Reproduce

Found out when bos peers stopped working with

edge not found

errors after the update to LND v0.20.0-beta.

lncli listchannels show the channels as active.

Expected Behavior

Channels should show up.

Debug Information

2025-12-01 20:09:22.200 [ERR] NANN: Unable to retrieve chan status for Channel(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1): could not fetch channel: edge not found
2025-12-01 20:09:22.202 [ERR] NANN: Unable to retrieve chan status for Channel(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:0): could not fetch channel: edge not found
2025-12-01 20:09:22.204 [ERR] NANN: Unable to retrieve chan status for Channel(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2): could not fetch channel: edge not found
2025-12-01 20:09:22.207 [ERR] NANN: Unable to retrieve chan status for Channel(af73bba2a00dcc6b5e6d22b6b6c59948c75cb880b7a1c5de5df0a8319b99e7b7:1): could not fetch channel: edge not found
2025-12-01 20:09:22.209 [ERR] NANN: Unable to retrieve chan status for Channel(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:1): could not fetch channel: edge not found
2025-12-01 20:09:22.212 [ERR] NANN: Unable to retrieve chan status for Channel(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:3): could not fetch channel: edge not found
2025-12-01 20:09:22.213 [ERR] NANN: Unable to retrieve chan status for Channel(870112b937e44d023e22cdca352daf908e9f4b149afdd4e48ef2638f1c5860a7:4): could not fetch channel: edge not found
2025-12-01 20:09:22.214 [ERR] NANN: Unable to retrieve chan status for Channel(a3131b0014ebc9184d3697facd50b4b3dfd68fccf1148cf0ebc833ed73c11dc6:1): could not fetch channel: edge not found
2025-12-01 20:09:22.215 [ERR] NANN: Unable to retrieve chan status for Channel(d77290ad97775b15f2f23e5cea49b785ccb172bc5608ef8f6913be6947da10c6:0): could not fetch channel: edge not found
2025-12-01 20:09:22.215 [ERR] NANN: Unable to retrieve chan status for Channel(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:2): could not fetch channel: edge not found
2025-12-01 20:09:22.216 [ERR] NANN: Unable to retrieve chan status for Channel(69d5b6db12ff4177a49783c66785df6824fcad62b9639bd9a46acd053a42801c:1): could not fetch channel: edge not found
2025-12-01 20:10:22.284 [ERR] NANN: Unable to retrieve chan status for Channel(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1): could not fetch channel: edge not found
2025-12-01 20:10:22.287 [ERR] NANN: Unable to retrieve chan status for Channel(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:0): could not fetch channel: edge not found
2025-12-01 20:10:22.290 [ERR] NANN: Unable to retrieve chan status for Channel(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2): could not fetch channel: edge not found
2025-12-01 20:10:22.293 [ERR] NANN: Unable to retrieve chan status for Channel(af73bba2a00dcc6b5e6d22b6b6c59948c75cb880b7a1c5de5df0a8319b99e7b7:1): could not fetch channel: edge not found
2025-12-01 20:10:22.296 [ERR] NANN: Unable to retrieve chan status for Channel(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:1): could not fetch channel: edge not found
2025-12-01 20:10:22.300 [ERR] NANN: Unable to retrieve chan status for Channel(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:3): could not fetch channel: edge not found
2025-12-01 20:10:22.301 [ERR] NANN: Unable to retrieve chan status for Channel(870112b937e44d023e22cdca352daf908e9f4b149afdd4e48ef2638f1c5860a7:4): could not fetch channel: edge not found
2025-12-01 20:10:22.303 [ERR] NANN: Unable to retrieve chan status for Channel(a3131b0014ebc9184d3697facd50b4b3dfd68fccf1148cf0ebc833ed73c11dc6:1): could not fetch channel: edge not found
2025-12-01 20:10:22.304 [ERR] NANN: Unable to retrieve chan status for Channel(d77290ad97775b15f2f23e5cea49b785ccb172bc5608ef8f6913be6947da10c6:0): could not fetch channel: edge not found
2025-12-01 20:10:22.305 [ERR] NANN: Unable to retrieve chan status for Channel(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:2): could not fetch channel: edge not found
2025-12-01 20:10:22.306 [ERR] NANN: Unable to retrieve chan status for Channel(69d5b6db12ff4177a49783c66785df6824fcad62b9639bd9a46acd053a42801c:1): could not fetch channel: edge not found

Environment

k8s deployment with https://github.com/blinkbitcoin/charts/tree/main/charts/lnd

openoms avatar Dec 01 '25 22:12 openoms

have you tried a restart, we fixed a big in this area see: https://github.com/lightningnetwork/lnd/pull/10378

ziggie1984 avatar Dec 01 '25 22:12 ziggie1984

do you see the channels when you query for it via lncli getchaninfo ...

ziggie1984 avatar Dec 01 '25 22:12 ziggie1984

have you tried a restart, we fixed a big in this area see: #10378

yes, the node has been restarted multiple times with no change in this issue

do you see the channels when you query for it via lncli getchaninfo ...

It is similar for all the listed channels:

lncli getchaninfo --chan_point 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1
{
    "channel_id":  "1017489159958298625",
    "chan_point":  "196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1",
    "last_update":  0,
    "node1_pub":  "021294fff596e497ad2902cd5f19673e9020953d90625d68c22e91b51a45c032d3",
    "node2_pub":  "02dfb4c1dd59216fa6a28d0f012e188516f63517db68c4e4b82c3af41343a05bc4",
    "capacity":  "15000000",
    "node1_policy":  null,
    "node2_policy":  null,
    "custom_records":  {},
    "auth_proof":  null
}

Got more errors when tried running for the missing edges listed in the logs:

# Channel 1 - capacity 15000000 sats → max_htlc 14850000000 msat
lncli updatechanpolicy --chan_point 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1 --create_missing_edge --base_fee_msat 0 --fee_rate_ppm 1000 --time_lock_delta 60 --min_htlc_msat 1000 --max_htlc_msat 14850000000

# Channel 2 - capacity 15000000 sats → max_htlc 14850000000 msat
lncli updatechanpolicy --chan_point 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2 --create_missing_edge --base_fee_msat 0 --fee_rate_ppm 1000 --time_lock_delta 60 --min_htlc_msat 1000 --max_htlc_msat 14850000000
logs | grep -i "edge\|policy\|error"

2025-12-01 22:39:19.532 [WRN] LCHN: Missing edge for active channel (196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1) during policy update. Recreating edge with default policy.
2025-12-01 22:39:19.542 [ERR] LCHN: Attempt to add missing edge for channel (196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1) errored with: expected the funding transaction script to be set
2025-12-01 22:39:19.727 [WRN] LCHN: Missing edge for active channel (196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2) during policy update. Recreating edge with default policy.
2025-12-01 22:39:19.791 [ERR] LCHN: Attempt to add missing edge for channel (196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2) errored with: expected the funding transaction script to be set
2025-12-01 22:39:19.963 [WRN] LCHN: Missing edge for active channel (196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:3) during policy update. Recreating edge with default policy.
2025-12-01 22:39:20.070 [ERR] LCHN: Attempt to add missing edge for channel (196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:3) errored with: expected the funding transaction script to be set
2025-12-01 22:39:20.176 [WRN] LCHN: Missing edge for active channel (6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:0) during policy update. Recreating edge with default policy.
2025-12-01 22:39:20.225 [ERR] LCHN: Attempt to add missing edge for channel (6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:0) errored with: expected the funding transaction script to be set
2025-12-01 22:39:20.351 [WRN] LCHN: Missing edge for active channel (6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:1) during policy update. Recreating edge with default policy.
2025-12-01 22:39:20.411 [ERR] LCHN: Attempt to add missing edge for channel (6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:1) errored with: expected the funding transaction script to be set
2025-12-01 22:39:20.564 [WRN] LCHN: Missing edge for active channel (6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:2) during policy update. Recreating edge with default policy.
2025-12-01 22:39:20.677 [ERR] LCHN: Attempt to add missing edge for channel (6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:2) errored with: expected the funding transaction script to be set
2025-12-01 22:39:20.789 [WRN] LCHN: Missing edge for active channel (af73bba2a00dcc6b5e6d22b6b6c59948c75cb880b7a1c5de5df0a8319b99e7b7:1) during policy update. Recreating edge with default policy.
2025-12-01 22:39:20.852 [ERR] LCHN: Attempt to add missing edge for channel (af73bba2a00dcc6b5e6d22b6b6c59948c75cb880b7a1c5de5df0a8319b99e7b7:1) errored with: expected the funding transaction script to be set
2025-12-01 22:39:21.003 [WRN] LCHN: Missing edge for active channel (870112b937e44d023e22cdca352daf908e9f4b149afdd4e48ef2638f1c5860a7:4) during policy update. Recreating edge with default policy.
2025-12-01 22:39:21.102 [ERR] LCHN: Attempt to add missing edge for channel (870112b937e44d023e22cdca352daf908e9f4b149afdd4e48ef2638f1c5860a7:4) errored with: expected the funding transaction script to be set
2025-12-01 22:39:21.253 [WRN] LCHN: Missing edge for active channel (a3131b0014ebc9184d3697facd50b4b3dfd68fccf1148cf0ebc833ed73c11dc6:1) during policy update. Recreating edge with default policy.
2025-12-01 22:39:21.345 [ERR] LCHN: Attempt to add missing edge for channel (a3131b0014ebc9184d3697facd50b4b3dfd68fccf1148cf0ebc833ed73c11dc6:1) errored with: expected the funding transaction script to be set
2025-12-01 22:39:21.505 [WRN] LCHN: Missing edge for active channel (d77290ad97775b15f2f23e5cea49b785ccb172bc5608ef8f6913be6947da10c6:0) during policy update. Recreating edge with default policy.
2025-12-01 22:39:21.610 [ERR] LCHN: Attempt to add missing edge for channel (d77290ad97775b15f2f23e5cea49b785ccb172bc5608ef8f6913be6947da10c6:0) errored with: expected the funding transaction script to be set
2025-12-01 22:39:22.209 [ERR] NANN: Unable to retrieve chan status for Channel(69d5b6db12ff4177a49783c66785df6824fcad62b9639bd9a46acd053a42801c:1): could not fetch channel: edge not found
2025-12-01 22:39:28.816 [ERR] HSWC: Unhandled error while reforwarding htlc settle/fail over htlcswitch: insufficient bandwidth to route htlc
2025-12-01 22:39:46.625 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1
2025-12-01 22:39:46.680 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2
2025-12-01 22:39:46.736 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:3
2025-12-01 22:39:46.791 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:0
2025-12-01 22:39:46.846 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:1
2025-12-01 22:39:46.900 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:2
2025-12-01 22:39:46.956 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: af73bba2a00dcc6b5e6d22b6b6c59948c75cb880b7a1c5de5df0a8319b99e7b7:1
2025-12-01 22:39:47.015 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: 870112b937e44d023e22cdca352daf908e9f4b149afdd4e48ef2638f1c5860a7:4
2025-12-01 22:39:47.072 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: a3131b0014ebc9184d3697facd50b4b3dfd68fccf1148cf0ebc833ed73c11dc6:1
2025-12-01 22:39:47.128 [ERR] LCHN: Got nil channel edge policy when updating a channel. Channel point: d77290ad97775b15f2f23e5cea49b785ccb172bc5608ef8f6913be6947da10c6:0
2025-12-01 22:39:47.451 [WRN] LCHN: Missing edge for active channel (69d5b6db12ff4177a49783c66785df6824fcad62b9639bd9a46acd053a42801c:1) during policy update. Recreating edge with default policy.
2025-12-01 22:39:47.624 [ERR] LCHN: Attempt to add missing edge for channel (69d5b6db12ff4177a49783c66785df6824fcad62b9639bd9a46acd053a42801c:1) errored with: expected the funding transaction script to be set
2025-12-01 22:40:26.367 [ERR] HSWC: Unhandled error while reforwarding htlc settle/fail over htlcswitch: insufficient bandwidth to route htlc

openoms avatar Dec 01 '25 22:12 openoms

Update: bos peers started working after running the commands above, but the lncli getchaninfo results still look the same.

bos fees does error still:

err: 
  - 503
  - GetFeeReportError
  - 
    err: 
      message: 2 UNKNOWN: 
      stack: 
        - Error: 2 UNKNOWN: 
        -     at callErrorFromStatus (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/call.js:32:19)
        -     at Object.onReceiveStatus (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/client.js:193:76)
        -     at Object.onReceiveStatus (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:361:141)
        -     at Object.onReceiveStatus (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:324:181)
        -     at /usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/resolving-call.js:135:78
        -     at process.processTicksAndRejections (node:internal/process/task_queues:77:11)
        - for call at
        -     at ServiceClientImpl.makeUnaryRequest (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/client.js:161:32)
        -     at ServiceClientImpl.<anonymous> (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/node_modules/@grpc/grpc-js/build/src/make-client.js:105:19)
        -     at asyncAuto.getFeeReport (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/lnd_methods/offchain/get_fee_rates.js:47:33)
        -     at runTask (/usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:293:13)
        -     at /usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:233:31
        -     at processQueue (/usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:243:13)
        -     at taskComplete (/usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:259:9)
        -     at /usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:286:17
        -     at /usr/lib/node_modules/balanceofsatoshis/node_modules/async/internal/onlyOnce.js:12:16
        -     at validate (/usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/lnd_methods/offchain/get_fee_rates.js:42:16)
        -     at runTask (/usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:295:13)
        -     at /usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:233:31
        -     at processQueue (/usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:243:13)
        -     at auto (/usr/lib/node_modules/balanceofsatoshis/node_modules/async/auto.js:230:5)
        -     at /usr/lib/node_modules/balanceofsatoshis/node_modules/lightning/lnd_methods/offchain/get_fee_rates.js:35:12
        -     at new Promise (<anonymous>)

lnd logs:

2025-12-01 22:57:28.360 [ERR] RPCS: [/lnrpc.Lightning/FeeReport]: no policy for outgoing channel 1017488060494774273 
2025-12-01 22:57:39.626 [ERR] RPCS: [/lnrpc.Lightning/FeeReport]: no policy for outgoing channel 1017488060494774273 

Restarted the node, but still showing those 11 channels without policy:

k -n blink-routing-routing  logs  -f  lnd1-0 -c lnd |  grep -i "edge\|policy\|error"
2025-12-01 23:03:19.840 [ERR] DISC: Unable to rebroadcast stale announcements: unable to retrieve outgoing channels: channel from self node has no policy
2025-12-01 23:03:23.035 [WRN] SRVR: No channel policy found for ChannelPoint(69d5b6db12ff4177a49783c66785df6824fcad62b9639bd9a46acd053a42801c:1): 
2025-12-01 23:03:23.087 [WRN] SRVR: No channel policy found for ChannelPoint(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:3): 
2025-12-01 23:03:23.088 [WRN] SRVR: No channel policy found for ChannelPoint(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:0): 
2025-12-01 23:03:23.090 [WRN] SRVR: No channel policy found for ChannelPoint(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:1): 
2025-12-01 23:03:23.092 [WRN] SRVR: No channel policy found for ChannelPoint(6b9e46118b49b9120a4423cc5bef50475217681de1fe3f1404e1b40e2d665f7d:2): 
2025-12-01 23:03:23.093 [WRN] SRVR: No channel policy found for ChannelPoint(af73bba2a00dcc6b5e6d22b6b6c59948c75cb880b7a1c5de5df0a8319b99e7b7:1): 
2025-12-01 23:03:23.095 [WRN] SRVR: No channel policy found for ChannelPoint(870112b937e44d023e22cdca352daf908e9f4b149afdd4e48ef2638f1c5860a7:4): 
2025-12-01 23:03:23.097 [WRN] SRVR: No channel policy found for ChannelPoint(a3131b0014ebc9184d3697facd50b4b3dfd68fccf1148cf0ebc833ed73c11dc6:1): 
2025-12-01 23:03:23.098 [WRN] SRVR: No channel policy found for ChannelPoint(d77290ad97775b15f2f23e5cea49b785ccb172bc5608ef8f6913be6947da10c6:0): 
2025-12-01 23:03:23.106 [WRN] SRVR: No channel policy found for ChannelPoint(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1): 
2025-12-01 23:03:23.108 [WRN] SRVR: No channel policy found for ChannelPoint(196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:2): 2025-12-01 23:09:19.064 [ERR] HSWC: Unhandled error while reforwarding htlc settle/fail over htlcswitch: insufficient bandwidth to route htlc
2025-12-01 23:09:49.665 [ERR] HSWC: Unhandled error while reforwarding htlc settle/fail over htlcswitch: insufficient bandwidth to route htlc

openoms avatar Dec 01 '25 22:12 openoms

cc @ellemouton

ziggie1984 avatar Dec 01 '25 23:12 ziggie1984

Fixed only by closing two channels. Looking at the getchaninfo output for the remaining errored chanels node1_policy or node2_policy is still null:

Channel Own Node Position Own Policy
:1 (196f...) node2_pub null
:0 (6b9e...) node2_pub null
:2 (6b9e...) node1_pub null
:1 (af73...) node1_pub null
:4 (8701...) node1_pub null
:1 (a313...) node1_pub null
:0 (d772...) node1_pub null
:1 (69d5...) node1_pub null

Running updatechanpolicy for the channels with missing policies results in:

lncli updatechanpolicy --chan_point 196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1 --base_fee_msat 0 --fee_rate_ppm 1000 --time_lock_delta 60 --min_htlc_msat 1000 --max_htlc_msat 148500000
00
{
    "failed_updates": [
        {
            "outpoint": "196f058ae8d244a5be3e4fe4b5563406fef30fa24825dd7f528c7506ff36835a:1",
            "reason": "UPDATE_FAILURE_NOT_FOUND",
            "update_error": "edge policy not found"
        }
    ]
}

openoms avatar Dec 02 '25 10:12 openoms

Could chantools dropchannelgraph be used here at all (given the postgres database)?

Adding some analyses made with Claude Opus 4.5 in case it is useful:

Root Cause Analysis

There are two bugs in lnd/routing/localchans/manager.go:

Bug 1: Missing code path for "edge exists but policy is nil"

The processChan function (lines 87-150) only handles:

  1. Edge found with policy → update it ✓
  2. Edge not found → try createMissingEdge (if flag set) ✓

Missing case: Edge found but own policy is nil → no code to create the policy!

When ForAllOutgoingChannels finds the channel but the policy is nil, it fails with:

Got nil channel edge policy when updating a channel

The --create_missing_edge flag only triggers when the channel is NOT found in the first pass. Since the edge exists, the channel IS found, so createMissingEdge is never called.

Bug 2: FundingScript not set in createEdge

When --create_missing_edge does try to recreate a missing edge (when edge is truly missing), it creates a ChannelEdgeInfo but never sets the FundingScript field:

info := &models.ChannelEdgeInfo{
    ChannelID:    shortChanID.ToUint64(),
    ChainHash:    channel.ChainHash,
    Features:     lnwire.EmptyFeatureVector(),
    Capacity:     channel.Capacity,
    ChannelPoint: channel.FundingOutpoint,
    // FundingScript is NOT set!
}

When AddEdge is called in lnd/graph/builder.go (line 1064), it tries to unwrap FundingScript:

fundingPkScript, err := edge.FundingScript.UnwrapOrErr(fmt.Errorf(
    "expected the funding transaction script to be set",
))

The Fix

  1. Bug 1 fix: In processChan, when edge == nil, instead of just failing, call createMissingEdge to create the policy (not just the edge).

  2. Bug 2 fix: In createEdge, derive and set the FundingScript using the channel's multisig keys. Similar logic exists in:

    • lnd/funding/manager.go:makeFundingScript()
    • lnd/contractcourt/chain_watcher.go:deriveFundingPkScript()

openoms avatar Dec 02 '25 11:12 openoms

@openoms - do you still have the logs from the first start-up with 0.20? (ie, the startup showing the migration logs) wondering if any warnings appeared there

ellemouton avatar Dec 02 '25 16:12 ellemouton

@ellemouton thank you for looking into this:

Got the logs, there are some SQL errors which could be possible related:

2025-11-28 21:21:10.514 [INF] GRDB: Pruned unconnected node 034efff7008339c9080e71a6bda4bf81799b7274a8b1545a39fc9c80141fb399c8 from channel graph
2025-11-28 21:21:10.515 [INF] GRDB: Pruned 1 unconnected nodes from the channel graph
2025-11-28 21:24:28.788 [WRN] GRDB: Node=02a624a8019bb184d208045c07e511a571dfd6d9f1fb6c590c0f8ce5c424748f38 not found in graph cache
2025-11-28 21:24:31.715 [WRN] GRDB: Node=02d1fa530f8303e6dcaecc4efe22e98b4db1dece72192a24004dcb9f4607051ba0 not found in graph cache
2025-11-28 21:24:38.483 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 21:24:38.977 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 21:24:39.608 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 21:24:40.240 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:25:04.503
lnd
2025-11-28 21:25:04.503 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:33:54.436
lnd
2025-11-28 21:33:54.436 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:33:55.175
lnd
2025-11-28 21:33:55.175 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:34:13.206
lnd
2025-11-28 21:34:13.205 [WRN] GRDB: Node=02704f6cfc96c6f9798761b65627421650f544859756ef92032776568678ccc69c not found in graph cache
2025-11-28 22:41:57.762
lnd
2025-11-28 21:41:57.761 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:48:14.471
lnd
2025-11-28 21:48:14.470 [WRN] GRDB: Node=03c25c44b4cf5c19b3b45d61da5bd62e9ddfb6d923bb49c82872f50dbc3261d7dc not found in graph cache
2025-11-28 22:51:34.472
lnd
2025-11-28 21:51:34.472 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:55:49.560
lnd
2025-11-28 21:55:49.560 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:59:59.568
lnd
2025-11-28 21:59:59.568 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 22:59:59.968
lnd
2025-11-28 21:59:59.968 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:00:04.473
lnd
2025-11-28 22:00:04.473 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:03:29.739
lnd
2025-11-28 22:03:29.739 [WRN] GRDB: Node=03aab463365767af8aae383a20c3c62f49d76eefd100baaa5690f570dfe0da14de not found in graph cache
2025-11-28 23:16:18.487
lnd
2025-11-28 22:16:18.486 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:20:48.311
lnd
2025-11-28 22:20:48.310 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:21:56.283
lnd
2025-11-28 22:21:56.283 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:22:21.090
lnd
2025-11-28 22:22:21.089 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:22:37.987
lnd
2025-11-28 22:22:37.987 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:22:43.539
lnd
2025-11-28 22:22:43.539 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:25:47.792
lnd
2025-11-28 22:25:47.792 [ERR] GRDB: UpdateEdgePolicy faild: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-11-28 23:37:43.128
lnd
2025-11-28 22:37:43.127 [WRN] GRDB: Node=02677674424ee4ca345c2062117354b554f7a5529cc0124393b4ae1ee47944be6e not found in graph cache
2025-11-28 23:38:34.256
lnd
2025-11-28 22:38:34.255 [INF] GRDB: Pruned unconnected node 037ff43cb2021a83ec0b35bc9b26d9e94dae5abf28724e80583a876ee11870696a from channel graph
2025-11-28 23:38:34.260
lnd
2025-11-28 22:38:34.260 [INF] GRDB: Pruned unconnected node 027b3c0b8736a9040f725420aad08e9a6d41b9892231576aed9f170607c4c2dace from channel graph
2025-11-28 23:38:34.262
lnd
2025-11-28 22:38:34.261 [INF] GRDB: Pruned 2 unconnected nodes from the channel graph
2025-11-28 23:41:41.671
lnd
2025-11-28 22:41:41.671 [INF] GRDB: Starting migration of the graph store from KV to SQL
2025-11-28 23:43:01.132
lnd
2025-11-28 22:43:01.132 [INF] GRDB: Migrated 17302 nodes from KV to SQL in 1m19.459063767s (skipped 0 nodes due to invalid TLV streams or invalid addresses)
2025-11-28 23:43:01.153
lnd
2025-11-28 22:43:01.153 [INF] GRDB: Migrated source node with pubkey 02dfb4c1dd59216fa6a28d0f012e188516f63517db68c4e4b82c3af41343a05bc4 to SQL
2025-11-28 23:45:27.611
lnd
2025-11-28 22:45:27.610 [INF] GRDB: Migrated 51533 channels and 81366 policies from KV to SQL in 2m26.457773371s(skipped 0 channels and 0 policies due to invalid TLV streams)
2025-11-28 23:45:40.029
lnd
2025-11-28 22:45:40.029 [INF] GRDB: Migrated 13263 prune log entries from KV to SQL in 12.418630941s. The prune tip is: height 925597, hash: 0000000000000000000009ecf77fbd90deaea0ab35392e6dfd86e308e68e2783
2025-11-28 23:47:13.048
lnd
2025-11-28 22:47:13.047 [INF] GRDB: Migrated 101579 closed SCIDs from KV to SQL in 1m33.018362931s
2025-11-28 23:48:31.611
lnd
2025-11-28 22:48:31.611 [INF] GRDB: Migrated 667 zombie channels from KV to SQL in 1m18.563340295s
2025-11-28 23:48:31.611
lnd
2025-11-28 22:48:31.611 [INF] GRDB: Finished migration of the graph store from KV to SQL in 6m49.939451639s
2025-11-28 23:48:55.889
lnd
2025-11-28 22:48:55.889 [INF] GRDB: Populating in-memory channel graph, this might take a while...
2025-11-28 23:48:56.743
lnd
2025-11-28 22:48:56.743 [INF] GRDB: Finished populating in-memory channel graph (took 850.205802ms, num_node_features=17302, num_nodes=16988, num_channels=101026)
2025-11-28 23:52:17.944
lnd
2025-11-28 22:52:17.944 [WRN] GRDB: Node=03e18892b70da265e79a0170825d358a4dfcbe93e15070a937959275e638bef503 not found in graph cache
2025-11-28 23:01:17.955 [WRN] GRDB: Node=03d53ff9ccc6f8bd950eccc96f53b490d88510a1fa4b451a516e420e0863164613 not found in graph cache
2025-11-28 23:02:31.982 [WRN] GRDB: Node=03d53ff9ccc6f8bd950eccc96f53b490d88510a1fa4b451a516e420e0863164613 not found in graph cache
2025-11-28 23:03:29.494 [WRN] GRDB: Node=0350152ede0ec135c80dd6095f4b264fbf9cd8b2ac2e33b7c004f33fc82cd33629 not found in graph cache

I see this SQLSTATE 40001 come up occasionally ever since:

2025-12-02 06:46:33.357 [ERR] GRDB: UpdateEdgePolicy faild: inserting chan policy extra TLVs: unable to insert channel_policy(58023) extra signed field(55555): ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-12-02 14:07:18.891 [ERR] GRDB: UpdateEdgePolicy faild: unable to upsert edge policy: ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-12-02 14:37:19.143 [ERR] GRDB: UpdateEdgePolicy faild: inserting chan policy extra TLVs: unable to insert channel_policy(19681) extra signed field(55555): ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)
2025-12-02 15:58:19.269 [ERR] GRDB: UpdateEdgePolicy faild: inserting chan policy extra TLVs: unable to insert channel_policy(80828) extra signed field(55555): ERROR: could not serialize access due to read/write dependencies among transactions (SQLSTATE 40001)

The postgres database is a GCP managed instance with these flags:

 auto_explain.log_buffers : on
max_locks_per_transaction : 128
wal_compression : zstd
checkpoint_timeout : 600
work_mem : 131072
auto_explain.log_analyze : on
random_page_cost : 1.1
auto_explain.log_min_duration : 500
max_connections : 100
autovacuum_vacuum_cost_limit : 2000
max_pred_locks_per_transaction : 1024

openoms avatar Dec 02 '25 19:12 openoms

can you share you complete log via DM (on slack for example) ?

ziggie1984 avatar Dec 02 '25 19:12 ziggie1984

@openoms still waiting for some logs, we want to eventually fix the problem in 20.1, so it would be very helpful if we can get more data.

ziggie1984 avatar Dec 03 '25 19:12 ziggie1984

can you share you complete log via DM (on slack for example) ?

shared in Slack DM @ziggie1984

One more thing to note maybe that the postgres database instance CPU usage was close to 100% at the time which cause some increased waits in the operations.

openoms avatar Dec 04 '25 07:12 openoms

I'm experiencing the same issue but with sqlite, this is for all 8 of my channels. I upgraded from 0.19.3 to 0.20.0, then used lndinit to migrate from bbolt to sqlite, then to native SQL. lndinit reported "Migration of all mandatory db parts completed successfully". listchannels reports the channels are active.

Do I need to take action to secure the funds?

20a8vnf30 avatar Dec 06 '25 23:12 20a8vnf30

@20a8vnf30 your node should be fine, its just something with the graph data

ziggie1984 avatar Dec 07 '25 18:12 ziggie1984

Can confirm that funds are not at risk. I have cooperatively closed all the problematic channels with missing chanpolicy on either side. Now the fee setting functions are back.

openoms avatar Dec 07 '25 21:12 openoms

You can fix the graph data via: https://github.com/lightningnetwork/lnd/pull/10410

I will open a separate issue investigating why sometimes the channel policy is missing in our own graph data.

ziggie1984 avatar Dec 14 '25 08:12 ziggie1984