slower payments with clboss running
I am running node 02d695b01c7a6909e716c863fb39bc5fb7bbdc3824b7fdce53adc593e5be080e73 and have restarted clboss after several months of running the node without it. Clboss has been probing the network for three days straight, keeping the topology process at 100% CPU load. I also notice that payments (small and large) take significantly longer (20-30 s) than without clboss running (2-10 s). Is this expected behavior, or do I need to change anything in my configuration?
What version of CLBOSS are you running?
lightning-cli clboss-status
I am running clboss version v0.14.1-rc2 over lightning version v24.08-modded.
I've noticed topology being very busy (~90-100% CPU) on my node as well. My node has two cores, so one process being 100% consumes half the total available CPU ...
How many cores does your system have?
The VM has four cores. I am okay with having one core at 100% as long as it's productive (I assume there is a lot to measure for 200 channels). The problem is the responsiveness and waiting 20-30 seconds for a payment to clear, which usually takes only a few seconds.
I totally agree, I was making sure we weren't simply starving a single CPU ...
Another resource I've had to tune is memory ... any chance your system is paging?
I run sar -W 60 in a window to make sure pswpin/s isn't more than single digits ...
Thank you, Ken, for looking into this.
I have to walk back my claim that clboss is slowing payments down. After I restarted clboss, I now get comparable performance ("tested" by zapping notes of the same nostr account). Also, "topology" no longer runs at 100% CPU but at around 35%. I am trying to understand why it is different this time, and I will check again once clboss has been running for at least 24 hours. If it stays like this, I will be pretty happy.
24 hours later: everything is still working fine, with topology at < 50%, no paging, and payments without delay.
The sar measurements with and without clboss are not significantly different. Without clboss, it is mostly 0.00 and sometimes non-zero, peaking at 0.28.
Another observation that may or may not be relevant is that this time, compared to the last time I wrote, ChannelFinderByPopularity seems to be making real progress [plugin-clboss: ChannelFinderByPopularity: Progress: 10817 / 17348 (0.623530)]. Before, every time I checked, it seemed just to have started working on the first percent.
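For anyone who wants to track that progress figure over time rather than eyeballing the log, here is a minimal sketch in Python. It assumes the log line has exactly the shape quoted above; the `parse_progress` helper and the regular expression are illustrative and not part of CLBOSS.

```python
import re

# Pattern for the progress message quoted above. The exact log format is an
# assumption based on that single sample line.
PROGRESS_RE = re.compile(
    r"ChannelFinderByPopularity: Progress: (\d+) / (\d+) \(([\d.]+)\)"
)


def parse_progress(line):
    """Return (done, total, fraction) from a progress log line, or None."""
    match = PROGRESS_RE.search(line)
    if match is None:
        return None
    done, total = int(match.group(1)), int(match.group(2))
    # Recompute the fraction instead of trusting the printed value.
    return done, total, done / total


line = "[plugin-clboss: ChannelFinderByPopularity: Progress: 10817 / 17348 (0.623530)]"
done, total, fraction = parse_progress(line)
print(f"{done} of {total} processed ({fraction:.1%})")
```

Running this against successive log lines would show whether the finder is actually advancing or is stuck near the start, which is the difference described above.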
Here is my sar output with clboss running and with all 0.00 measurements deleted:
marius@mlbb2:~$ sar -W 60
Linux 5.15.0-124-generic (mlbb2) 12/03/2024 _x86_64_ (4 CPU)
08:17:14 AM pswpin/s pswpout/s
08:18:14 AM 0.07 0.00
08:21:14 AM 0.03 0.00
08:24:14 AM 0.07 0.00
08:25:14 AM 0.03 0.00
08:29:14 AM 0.02 0.00
08:30:14 AM 0.22 0.00
08:34:14 AM 0.20 0.00
08:40:14 AM 0.28 0.00
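The filtering above was done by hand. A small Python sketch of the same check follows, assuming the 12-hour `sar -W` layout shown in this output (time, AM/PM, pswpin/s, pswpout/s). The `parse_sar_w` helper and the limit of 10 pages/s (one reading of the "single digits" rule of thumb mentioned earlier) are illustrative assumptions, not sysstat features.

```python
# Sample taken from the sar output posted above.
SAR_OUTPUT = """\
Linux 5.15.0-124-generic (mlbb2) 12/03/2024 _x86_64_ (4 CPU)
08:17:14 AM pswpin/s pswpout/s
08:18:14 AM 0.07 0.00
08:21:14 AM 0.03 0.00
08:24:14 AM 0.07 0.00
08:25:14 AM 0.03 0.00
08:29:14 AM 0.02 0.00
08:30:14 AM 0.22 0.00
08:34:14 AM 0.20 0.00
08:40:14 AM 0.28 0.00
"""

# Assumed threshold: "single digits" read as fewer than 10 pages/s swapped in.
SWAP_IN_LIMIT = 10.0


def parse_sar_w(text):
    """Return (timestamp, pswpin_per_s, pswpout_per_s) for each data row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # banner, blank, or "Average:" lines
        try:
            swap_in, swap_out = float(fields[2]), float(fields[3])
        except ValueError:
            continue  # column header row
        rows.append((f"{fields[0]} {fields[1]}", swap_in, swap_out))
    return rows


rows = parse_sar_w(SAR_OUTPUT)
nonzero = [r for r in rows if r[1] > 0 or r[2] > 0]
peak = max(rows, key=lambda r: r[1])
over_limit = [r for r in rows if r[1] >= SWAP_IN_LIMIT]

print(f"{len(nonzero)} non-zero samples, peak pswpin/s {peak[1]} at {peak[0]}")
print("paging looks fine" if not over_limit else f"over limit: {over_limit}")
```

On a system that prints 24-hour timestamps the rows have three fields instead of four, so the field count and indices would need adjusting.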
Your paging looks good. Hmm.
I need to learn why CLBOSS is so topology-heavy sometimes.
Also, CLN is improving pay and xpay in the v24.11 release; need to understand if CLBOSS should/could use askrene ...
@ksedgwic it looks like CLN can be set to have xpay take over for pay by setting the xpay-handle-pay config option: https://github.com/ElementsProject/lightning/releases/tag/v24.11rc2
Cool. On the command line, I have lately been using renepay. Is there a good article that explains the difference between these new options?