Would it be possible to give txnum2txhash the same structure on disk as txhash2txnum?

Open hMsats opened this issue 3 years ago • 7 comments

Me again :-)

Would it be possible to store the flat file txnum2txhash as smaller files, as is done in the directory txhash2txnum? The reason for this question is that I back up my whole system every month via rsync, which only copies the files that have changed. This is usually a fraction of the total number of files on my system and results in a huge speedup.

For the bitcoin blockchain I perform multiple rsyncs until nothing is copied anymore and then "repair" the blockchain running the software on a different computer. In this way I'm able to keep the bitcoin software running while making a backup and keep all my connections.

From what I've seen, a similar trick is possible for Fulcrum but it's the big 21 GB txnum2txhash that makes it hard/impossible. But even if it's necessary to stop Fulcrum, do a final rsync and then restart Fulcrum, the downtime would be much smaller.

More in general, I think for backup purposes it's always a good idea to keep files on disk relatively small.

hMsats avatar Nov 20 '21 07:11 hMsats

Hi, sorry it took so long to reply. Yes it's possible. Doesn't rsync do binary diffs and only copy subsets of files? Or.. does it copy the whole files?

> From what I've seen, a similar trick is possible for Fulcrum

Hmm. It's only a matter of time before you get some corruption there but.. ok. Surprised that works at all... who knows what rocksdb does in the background. It's constantly writing to disk so.. sounds scary that it works. Ha ha.

> More in general, I think for backup purposes it's always a good idea to keep files on disk relatively small.

Yeah, probably. I'll think about implementing this at some point. It would require some engineering to get right and also be configurable by the user. Likely if user turns it on, the app will have to "split" the file on the spot at startup. And if user turns it off, or adjusts a param, it would unsplit/resplit.. or something.

cculianu avatar Dec 02 '21 14:12 cculianu

> Hi, sorry it took so long to reply.

No problem. It's an enhancement, not a bug.

> Doesn't rsync do binary diffs and only copy subsets of files? Or.. does it copy the whole files?

Ah, I now see that it's possible to copy only the delta of very big files. I hadn't found that before when searching with Google. It is not the default behavior of rsync, but it's possible with the right options. I will try that out first. Nevertheless, keeping files on disk relatively small is still better. I'll report back.

hMsats avatar Dec 02 '21 18:12 hMsats

Eh, better for what exactly? A file is just an entry in a table. The disk blocks are the same. If anything it's slightly faster to not have to have a big table for lookups, and slightly easier on the code to not have to keep a map of the files and what they contain.... :)

Anyway the txnum2txhash file grows only at the end, so rsync should be able to do a binary diff quite easily, since only the end of the file changes.
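
To illustrate why an append-only file is cheap to synchronize, here is a minimal Python sketch (not part of the original comment). It is not rsync's delta algorithm; it is a simpler stand-in that assumes the bytes already present in the backup never change in the source, so only the tail needs copying. The function name `append_tail` and its parameters are hypothetical.

```python
import os


def append_tail(src_path: str, dst_path: str, chunk_size: int = 1 << 20) -> int:
    """Copy only the bytes of src that lie beyond dst's current length.

    Assumption: src is append-only, so the first len(dst) bytes of src
    are identical to dst. Returns the number of bytes copied.
    """
    dst_size = os.path.getsize(dst_path) if os.path.exists(dst_path) else 0
    src_size = os.path.getsize(src_path)
    if src_size < dst_size:
        # The source shrank, so the append-only assumption does not hold.
        raise ValueError("source is shorter than backup; not append-only")

    copied = 0
    with open(src_path, "rb") as src, open(dst_path, "ab") as dst:
        src.seek(dst_size)  # skip the part the backup already has
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            copied += len(chunk)
    return copied
```

If the file is ever rewritten in the middle, this shortcut silently produces a wrong backup, which is why a checksum-based tool such as rsync is the safer choice in practice.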

I will think about this enhancement for the future, if anything, because it's a fun programming challenge.

cculianu avatar Dec 02 '21 19:12 cculianu

Did some research and my conclusion is that for backup purposes breaking up the large file txnum2txhash into smaller files is definitely better ...

hMsats avatar Dec 03 '21 18:12 hMsats

Ah ok gotcha. Thanks for the follow-up. I'll see about implementing that, time permitting.

cculianu avatar Dec 04 '21 01:12 cculianu

Just letting you know I haven't forgotten about this issue and do plan on addressing it. Probably Fulcrum will have an opt-in conf option for this, and if you select it, it will convert your 1 flat file into a directory with many files. Probably the user will just be able to select the size of each file or something or maybe I will just peg it at 100MB to make it easy?
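
A rough sketch of what such a conversion could look like, in Python for brevity (illustrative only, not Fulcrum's actual code). It assumes the flat file is a plain sequence of fixed-size 32-byte records indexed by txnum with no header, and it uses the 100MB figure floated above. The names `split_flat_file`, `locate`, `read_record`, and the `00000000.dat` naming scheme are all hypothetical.

```python
import os

RECORD_SIZE = 32                 # assumption: one 32-byte tx hash per record
CHUNK_BYTES = 100 * 1024 * 1024  # 100MB per file, a multiple of RECORD_SIZE


def chunk_name(index: int) -> str:
    """Hypothetical naming scheme for the split files."""
    return f"{index:08d}.dat"


def split_flat_file(src_path: str, out_dir: str, chunk_bytes: int = CHUNK_BYTES) -> int:
    """Split one flat file into numbered files of at most chunk_bytes each.

    Returns the number of files written.
    """
    os.makedirs(out_dir, exist_ok=True)
    count = 0
    with open(src_path, "rb") as src:
        while True:
            data = src.read(chunk_bytes)
            if not data:
                break
            with open(os.path.join(out_dir, chunk_name(count)), "wb") as out:
                out.write(data)
            count += 1
    return count


def locate(txnum: int, chunk_bytes: int = CHUNK_BYTES) -> tuple:
    """Map a record number to (file index, byte offset inside that file)."""
    records_per_file = chunk_bytes // RECORD_SIZE
    file_index, record_in_file = divmod(txnum, records_per_file)
    return file_index, record_in_file * RECORD_SIZE


def read_record(out_dir: str, txnum: int, chunk_bytes: int = CHUNK_BYTES) -> bytes:
    """Read a single record back from the split layout."""
    file_index, offset = locate(txnum, chunk_bytes)
    with open(os.path.join(out_dir, chunk_name(file_index)), "rb") as f:
        f.seek(offset)
        return f.read(RECORD_SIZE)
```

With this layout only the last file grows as new records are appended, so a file-level backup would leave all earlier files untouched.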

cculianu avatar Feb 05 '22 01:02 cculianu

Thanks a lot for letting me know. Yes, 100MB would be fine. It's not a big issue at the moment, but it will become more of a nuisance as the file gets even bigger. Whenever you have time. I've been running Fulcrum for many months now and it works perfectly and fast!

hMsats avatar Feb 05 '22 07:02 hMsats