
Pruning problem

Open kroese opened this issue 3 years ago • 17 comments

I did a fresh installation of bitcoind with pruning set to 100 GB and waited until it was completely synced. Then I installed CLN and connected it to the bitcoind node.

The problem is that for the last hour it keeps requesting the same block from 2018 every second in a loop:

UNUSUAL plugin-bcli: /usr/bin/bitcoin-cli -datadir=/data/.bitcoin -rpcconnect=172.17.0.2 -rpcport=8332 -rpcuser=... -rpcpassword=... getblock 00000000000000000005f7a06bd4efe545999aba00eeff9a49747a3cd1f3c9df 0 exited with status 1

I don't understand why it keeps on trying this same block, since it should realize it's not available after trying only once. I think it heard of this block through channel gossip, since I don't have any channels yet myself.

Besides the problem with CLN getting stuck on this block, I think I will have another problem.

Namely, will my graph miss all channels created more than a year ago? I thought a pruned node would be fully functional, but missing all the old channels would be a big downside.

So is my mistake that I should have already started CLN while Bitcoin was still syncing the chain? That way CLN would have had access to the blocks from 2018 that are now pruned. Or is there no solution?

getinfo output

{
   "id": "xxxx",
   "alias": "xxxx",
   "color": "xxxxxx",
   "num_peers": 1,
   "num_pending_channels": 0,
   "num_active_channels": 0,
   "num_inactive_channels": 0,
   "address": [
      {
         "type": "ipv4",
         "address": "xx.xx.xx.xx",
         "port": 9760
      }
   ],
   "binding": [
      {
         "type": "ipv4",
         "address": "0.0.0.0",
         "port": 9735
      }
   ],
   "version": "v0.10.2",
   "blockheight": 739674,
   "network": "bitcoin",
   "msatoshi_fees_collected": 0,
   "fees_collected_msat": "0msat",
   "lightning-dir": "/data/.lightning/bitcoin"
}

kroese avatar Jun 07 '22 07:06 kroese

Can you check whether your bitcoin instance has the block 00000000000000000005f7a06bd4efe545999aba00eeff9a49747a3cd1f3c9df? For pruning we have better alternatives, like https://github.com/clightning4j/btcli4j or the other backends listed in https://github.com/lightningd/plugins
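One way to run that check is sketched below in Python. It assumes that `bitcoin-cli getblockheader <hash>` still answers on a pruned node (headers are kept) and that `getblockchaininfo` reports `pruned` and `pruneheight`; the helper names are hypothetical and the connection flags from the log line are omitted.

```python
import json
import subprocess


def block_is_stored(header: dict, chain_info: dict) -> bool:
    """Decide from two RPC results whether the full block should be on disk."""
    if not chain_info.get("pruned", False):
        return True  # an unpruned node keeps every block
    # Assumption: on a pruned node 'pruneheight' is the lowest height still stored.
    return header["height"] >= chain_info["pruneheight"]


def bitcoin_cli(*args: str) -> dict:
    """Run bitcoin-cli and parse its JSON output (add -datadir etc. as needed)."""
    out = subprocess.run(["bitcoin-cli", *args], check=True,
                         capture_output=True, text=True).stdout
    return json.loads(out)


def check(block_hash: str) -> bool:
    """Look up the block's height, then compare it with the prune height."""
    header = bitcoin_cli("getblockheader", block_hash)
    return block_is_stored(header, bitcoin_cli("getblockchaininfo"))
```

Calling `check("00000000000000000005f7a06bd4efe545999aba00eeff9a49747a3cd1f3c9df")` on the node from the report would be expected to return `False` if the block has indeed been pruned.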

In addition, I think the two problems you have are related. In particular, I think the blockchain has grown by more than 100 GB in the last year.

vincenzopalazzo avatar Jun 07 '22 08:06 vincenzopalazzo

@vincenzopalazzo No, this block is from 2018 and I have only blocks from a year ago.

I am just trying to understand:

  • Why does it need this block? I have zero channels, so it is not needed for my channels. Does it need to verify the opening transaction for EVERY channel in the graph?

  • Why does it keep requesting it for hours? It should just fail once, and ignore the channel I suppose?
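For context on the first question: per BOLT 7, every announced channel carries a short_channel_id that encodes the block height, transaction index and output index of its funding output, so checking an announcement means looking inside that block. The sketch below, in Python, assumes the human-readable `HEIGHTxTXxOUT` form and a `pruneheight` value taken from `bitcoin-cli getblockchaininfo`; the function names and the example id are made up for illustration and do not come from CLN.

```python
from typing import NamedTuple


class ShortChannelId(NamedTuple):
    block_height: int
    tx_index: int
    output_index: int


def parse_short_channel_id(scid: str) -> ShortChannelId:
    """Parse the human-readable 'HEIGHTxTXxOUT' form of a short_channel_id."""
    parts = scid.split("x")
    if len(parts) != 3:
        raise ValueError(f"not a short_channel_id: {scid!r}")
    height, tx_index, output_index = (int(p) for p in parts)
    return ShortChannelId(height, tx_index, output_index)


def funding_block_is_pruned(scid: str, prune_height: int) -> bool:
    """True if the funding block lies below the lowest block still stored.

    Assumption: prune_height is the 'pruneheight' field of getblockchaininfo.
    """
    return parse_short_channel_id(scid).block_height < prune_height
```

With a prune height of 680000, a made-up channel id such as `505149x622x0` would point at a block that is no longer on disk, which matches the kind of failure shown in the log.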

I would rather not switch to another backend. I thought pruning was fully supported as long as you make sure C-Lightning does not fall too far behind.

kroese avatar Jun 07 '22 08:06 kroese

I did some more research, and it seems the mistake was indeed to wait until IBD had completed before starting C-Lightning. I should have let them run together while syncing. But this introduces other problems, as C-Lightning syncs slower than bitcoind and can fall too far behind.

The best solution would be if C-Lightning just implemented the getblockfrompeer RPC call that was recently added to Bitcoin.

So now my only option is to connect C-Lightning to an external full node (without pruning) to let it validate all the channels in the graph.

That leads me to the final question:

Is it safe to switch C-Lightning from the unpruned node back to the pruned node after it has validated all the channels? And how do I know it has finished validating every channel in the graph, so that I have the guarantee it will never need an old block again?

kroese avatar Jun 07 '22 09:06 kroese

> The best solution would be if C-Lightning just implemented the getblockfrompeer RPC call that was recently added to Bitcoin

this is an interesting idea as a fallback to getblock. cln on pruned nodes has always been a huge pain.
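A minimal Python sketch of such a fallback follows. It assumes Bitcoin Core 23.0 or later, where `getblockfrompeer <blockhash> <peer_id>` schedules an asynchronous fetch, and it simply picks the first peer from `getpeerinfo`. The `run` callback and the function name are hypothetical, and this is not how bcli is actually implemented.

```python
import json
from typing import Callable, List, Optional, Tuple

# A runner takes bitcoin-cli arguments and returns (exit_status, stdout).
Runner = Callable[[List[str]], Tuple[int, str]]


def get_block_with_fallback(block_hash: str, run: Runner) -> Optional[str]:
    """Return the raw block hex, or None if it had to be requested from a peer."""
    status, out = run(["getblock", block_hash, "0"])
    if status == 0:
        return out.strip()
    # Block is missing locally (e.g. pruned): ask the first connected peer.
    status, out = run(["getpeerinfo"])
    peers = json.loads(out) if status == 0 else []
    if not peers:
        raise RuntimeError("no peers available to fetch the block from")
    status, _ = run(["getblockfrompeer", block_hash, str(peers[0]["id"])])
    if status != 0:
        raise RuntimeError("getblockfrompeer was rejected")
    return None  # the fetch is asynchronous; the caller retries getblock later
```

Returning `None` rather than blocking leaves the caller free to decide how long to wait before asking for the block again.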

jb55 avatar Jun 07 '22 17:06 jb55

> this is an interesting idea as a fallback to getblock. cln on pruned nodes has always been a huge pain.

I'm working on translating it into a compiled language (really compiled)

vincenzopalazzo avatar Jun 07 '22 17:06 vincenzopalazzo

> Is it safe to switch C-Lightning from the unpruned node back to the pruned node after it has validated all the channels? And how do I know it has finished validating every channel in the graph, so that I have the guarantee it will never need an old block again?

I think old channels need to be verified too, so a channel that is 10 years old could be a problem. However, I'm not 100% sure about that.

cc @cdecker

vincenzopalazzo avatar Jun 07 '22 17:06 vincenzopalazzo

> So is my mistake that I should have already started CLN while Bitcoin was still syncing the chain? That way CLN would have had access to the blocks from 2018 that are now pruned. Or is there no solution?

My approach is to start bitcoind and then CLN while it is still syncing. I noticed that bitcoind prunes faster than CLN processes blocks, so I also keep this script running constantly as a workaround (it has locking, so just add * * * * * /home/cln/cln-prune-protector.sh 10000 >> /home/cln/cln-prune-protector.log 2>&1 to crontab). It temporarily disables bitcoind network activity if CLN falls too far behind. https://github.com/kristapsk/cln-scripts/blob/master/cln-prune-protector.sh
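The idea behind such a guard can be sketched in Python as follows. This is not the linked script's logic: the `max_lag` parameter and the function names are hypothetical, and the heights are assumed to come from `blocks` in `bitcoin-cli getblockchaininfo` and `blockheight` in `lightning-cli getinfo`.

```python
from typing import List


def network_should_be_active(bitcoind_blocks: int, cln_blockheight: int,
                             max_lag: int) -> bool:
    """Keep bitcoind downloading only while CLN is within max_lag blocks of it."""
    return bitcoind_blocks - cln_blockheight <= max_lag


def setnetworkactive_args(active: bool) -> List[str]:
    """Build the bitcoin-cli call that pauses or resumes P2P network activity."""
    return ["bitcoin-cli", "setnetworkactive", "true" if active else "false"]
```

Run from cron once a minute, a wrapper would read both heights, call `network_should_be_active`, and execute the command returned by `setnetworkactive_args`.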

> The best solution would be if C-Lightning just implemented the getblockfrompeer RPC call that was recently added to Bitcoin.

Kinda sounds right, but from my experience it will make CLN sync a lot slower, as for most of the sync time it will ask for every block that way.

kristapsk avatar Jun 07 '22 17:06 kristapsk

Slow is better than broken. There have been ideas thrown around in the past about using keep-blocks, but then you run into disk space back pressure and the disk might run out. I see your script is turning the network on and off... seems a bit extreme, but it's an interesting approach.

jb55 avatar Jun 07 '22 17:06 jb55

@kristapsk Yes, I saw your script and really liked it. But since I am running both Bitcoin and C-Lightning in separate docker containers, I would need to heavily modify the script to be able to use it from the host machine.

Also, I am not sure the script makes the process 100% watertight, because that would require a guarantee that CLN received all channel gossip before reaching the related blocks. If it receives an additional old channel after that, it will still fail to get the block. I don't know if there is a way to be sure that you have received the gossip for every channel ever created. And even if there is, there is always the possibility that someone broadcasts a new channel with a very old funding transaction.

kroese avatar Jun 07 '22 17:06 kroese

> I would need to heavily modify the script to be able to use it from the host machine.

Not sure about that. What you need is both bitcoin-cli and lightning-cli working in the CLN container. And CLN itself depends on a working bitcoin-cli, right?

The script was actively turning the network on and off during IBD; afterwards it hasn't turned it off again (but it would if, for example, the CLN service were not running). I have prune=20000 in bitcoin.conf on that specific VPS where I use it.

kristapsk avatar Jun 07 '22 17:06 kristapsk

prune=anynumber is unsafe with CLN for reasons you identified above.

$ bitcoin-cli help pruneblockchain
pruneblockchain height



Arguments:
1. height    (numeric, required) The block height to prune up to. May be set to a discrete height, or to a UNIX epoch time
             to prune blocks whose block time is at least 2 hours older than the provided timestamp.

Result:
n    (numeric) Height of the last block pruned

Examples:
> bitcoin-cli pruneblockchain 1000
> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "pruneblockchain", "params": [1000]}' -H 'content-type: text/plain;' http://127.0.0.1:8332/

The dependent app (in this case CLN) should instead be driving bitcoind's pruning with this RPC. With CLN in control of pruning you are never at risk of bitcoind pruning too far ahead.
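A rough Python sketch of that idea is below. It assumes bitcoind runs with `prune=1` (manual pruning only, so nothing is pruned automatically) and that the caller already knows CLN's `blockheight` from `lightning-cli getinfo`. The `keep_blocks` margin and the function names are hypothetical.

```python
from typing import List, Optional


def prune_target(cln_blockheight: int, keep_blocks: int) -> int:
    """Highest block that may be pruned while keeping a margin behind CLN."""
    return max(0, cln_blockheight - keep_blocks)


def pruneblockchain_args(cln_blockheight: int,
                         keep_blocks: int) -> Optional[List[str]]:
    """Build the bitcoin-cli call, or None when there is nothing safe to prune."""
    target = prune_target(cln_blockheight, keep_blocks)
    if target <= 0:
        return None
    return ["bitcoin-cli", "pruneblockchain", str(target)]
```

A margin like this only protects blocks CLN has not processed yet; it does not help with gossip that later refers to a much older funding block.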

wtogami avatar Jun 13 '22 18:06 wtogami

You are right. But even though I made the mistake of letting Bitcoin sync first, it is still a bug that CLN requested the same block in a loop for hours.

It would have made much more sense to skip the blocks and ignore the related channels, instead of going into a death loop.
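What "fail, back off, then give up" could look like is sketched below in Python. This is only an illustration of the suggestion, not CLN's code: the `fetch` callback, the attempt limit and the delays are all hypothetical.

```python
import time
from typing import Callable, Optional


def fetch_with_limit(fetch: Callable[[], Optional[str]],
                     max_attempts: int = 3,
                     base_delay: float = 1.0,
                     sleep: Callable[[float], None] = time.sleep) -> Optional[str]:
    """Try fetch() a bounded number of times with exponential backoff.

    Returns the block, or None so the caller can skip the related channel.
    """
    for attempt in range(max_attempts):
        block = fetch()
        if block is not None:
            return block
        if attempt + 1 < max_attempts:
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...
    return None
```

A caller that gets `None` back would drop or defer the channel announcement instead of requesting the same block every second.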

kroese avatar Jun 13 '22 18:06 kroese

I've been running in pruned mode successfully, but periodically it hits this bug. It's weird, and possibly caused by malicious gossip, because it always references a block from years ago, when lightning channels were only a glimmer in a nerd's eye.

I've found a workaround, because on its own it seems to get stuck in a loop requesting a block that doesn't exist, and all other node activity slows down. There are a couple of plugins that are meant to make running on a pruned node more reliable*. Although I have never been able to get btcli4j configured to sync properly (it seems unable to fetch blocks), just starting up clightning with that plugin clears the queue for fetching that block and then allows me to start up normally again.

  • https://github.com/clightning4j/btcli4j/tree/ecacb049d41e2282c5595e84a6f9db6a601c3bc3

AutonomousOrganization avatar Jun 13 '22 22:06 AutonomousOrganization

Sounds like a redundant fallback lookup for old blocks would be a perfect plugin.

wtogami avatar Jun 14 '22 00:06 wtogami

> https://github.com/clightning4j/btcli4j/tree/ecacb049d41e2282c5595e84a6f9db6a601c3bc3

I get "This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository."

kristapsk avatar Jun 14 '22 10:06 kristapsk

> https://github.com/clightning4j/btcli4j/tree/ecacb049d41e2282c5595e84a6f9db6a601c3bc3

> I get "This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository."

It is outside this repository - it is from the list of community plugins : https://github.com/lightningd/plugins

AutonomousOrganization avatar Jun 15 '22 17:06 AutonomousOrganization

@kristapsk @AutonomousOrganization Just use the master branch https://github.com/clightning4j/btcli4j

> There are a couple of plugins that are meant to make running on a pruned node more reliable*. Although I have never been able to get btcli4j configured to sync properly (it seems unable to fetch blocks)

I put all my effort into keeping my tool alive and maintained, but I cannot guess the bugs that people run into. If you open an issue, I can help you configure it.

Disclaimer: there is not really any configuration :) just a flag to run in pruning mode

vincenzopalazzo avatar Jun 15 '22 19:06 vincenzopalazzo

Just noticed the same issue on my new pruned node. Is there a reason why the official docs on pruning don't mention this issue? It looks like a common situation with non-negligible negative consequences.

Is it considered a bug or wontfix? Is it safe to ignore it, assuming bitcoind and lightningd agree on a current block height and it's up-to-date?

bubelov avatar Nov 22 '22 06:11 bubelov