rusty-blockparser

Resuming the balances callback from a previous version of the blockchain

Open Goro2030 opened this issue 4 years ago • 10 comments

This is a BIP :)

The change request is to let the user resume from a certain block, so they don't have to process the whole blockchain each time they want a new version of the "balances" after the blockchain has grown.

This seems to be easily achievable by loading the previous balances.csv file, using the naming convention ( balances0..641023.csv ), the software can determine which was the last block that was ingested, then load all the balances into the hash table in memory, and start ingesting from the following one until the end, then dump all of the hash table back to disk.
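As a rough sketch of the first two steps described above (finding the last ingested block from the file name and loading the old balances into a hash table), something like the following could work. The `<prefix>-<start>-<end>.csv` file name layout, the `address;balance` column layout, and the function names are assumptions made for illustration; they are not taken from the rusty-blockparser code. As the replies below explain, loading balances alone is not enough to resume correctly.

```rust
use std::collections::HashMap;

/// Extracts the last processed block height from a dump file name.
/// Assumes a hypothetical `<prefix>-<start>-<end>.csv` layout,
/// e.g. `balances-0-808500.csv`.
fn last_height_from_name(file_name: &str) -> Option<u64> {
    let stem = file_name.strip_suffix(".csv")?;
    let end = stem.rsplit('-').next()?;
    end.parse::<u64>().ok()
}

/// The block height at which a resumed run would start.
fn resume_height(file_name: &str) -> Option<u64> {
    last_height_from_name(file_name)?.checked_add(1)
}

/// Loads `address;balance` rows (balance in satoshis) into a hash table.
/// Rows that do not parse, such as a header line, are skipped.
fn load_balances(csv: &str) -> HashMap<String, u64> {
    csv.lines()
        .filter_map(|line| {
            let (address, balance) = line.split_once(';')?;
            let balance = balance.trim().parse::<u64>().ok()?;
            Some((address.to_string(), balance))
        })
        .collect()
}
```

With these assumptions, `resume_height("balances-0-808500.csv")` yields `Some(808501)`, and a file name without a height range yields `None`.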

Goro2030 avatar Jul 24 '20 22:07 Goro2030

That would be a nice feature. I'm not 100% sure that balances.csv is enough to resume; if not, unspent.csv can be used for that. If this is necessary, then the balances command should also dump unspent.csv. It shouldn't require many code changes, since everything is already in place besides the deserialization.

gcarq avatar Aug 02 '20 22:08 gcarq

Well, why wouldn't balances.csv work? If that file has all the existing addresses WITH a balance up to the last block you scanned, then starting the analysis from the next block and just adding to the existing address pool from balances.csv (plus any new address seen from there onwards) would do it.

Goro2030 avatar Aug 03 '20 05:08 Goro2030

The tx output script doesn't contain the sender address, only the recipient. So if a new transaction spends BTC that are already counted in balances.csv, it's hard to find the correct address to subtract the spent amount from.

gcarq avatar Aug 03 '20 10:08 gcarq

I see what you're saying: the new transaction doesn't contain the sender's address, only the IDs of the transactions whose outputs it spends. This is where having the blockchain indexed comes in handy, because after the initial indexing, every "update to the tip of the blockchain" run has to look up each of those earlier transactions.

Let's put down an example:

Address1 has 1 BTC
Address2 has 2 BTC
Address3 has 3 BTC

All of that is stored in the balances.csv file. And in the blockchain index, TXID34 states that Address2 owns the 2BTC.

Now you process a new block and see a transaction that spends 0.5 BTC from TXID34 and sends it to Address3. By looking up TXID34, you can determine that the owner is Address2, so now you have both accounts' balances; you decrease one and increase the other "in memory", and the final balances after that transaction would be:

Address1 has 1 BTC
Address2 has 1.5 BTC
Address3 has 3.5 BTC
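The example above can be sketched in code. This is a minimal illustration, not the parser's implementation: the `Unspent` struct, the `OutPoint` alias and the `apply_tx` function are hypothetical names, and amounts are held in satoshis. One detail differs from the wording above: a Bitcoin input always consumes the whole referenced output, so the sketch models the spend as consuming the full 2 BTC from TXID34 and creating two new outputs, 0.5 BTC to Address3 and 1.5 BTC of change back to Address2. The resulting balances are the same.

```rust
use std::collections::HashMap;

/// An unspent output: who owns it and how much it holds (in satoshis).
#[derive(Debug, Clone, PartialEq)]
struct Unspent {
    address: String,
    value: u64,
}

/// Key of an output: (transaction id, output index).
type OutPoint = (String, u32);

/// Applies one transaction to the in-memory state.
/// Returns false (and changes nothing) if any input refers to an
/// output that is not in the index, since its owner cannot be resolved.
fn apply_tx(
    utxos: &mut HashMap<OutPoint, Unspent>,
    balances: &mut HashMap<String, u64>,
    txid: &str,
    inputs: &[OutPoint],
    outputs: &[(String, u64)],
) -> bool {
    if !inputs.iter().all(|input| utxos.contains_key(input)) {
        return false;
    }
    // Each input consumes an earlier output in full; the index tells us
    // which address to debit.
    for input in inputs {
        if let Some(spent) = utxos.remove(input) {
            if let Some(balance) = balances.get_mut(&spent.address) {
                *balance = balance.saturating_sub(spent.value);
            }
        }
    }
    // Each new output credits its recipient and becomes spendable.
    for (index, (address, value)) in outputs.iter().enumerate() {
        *balances.entry(address.clone()).or_insert(0) += *value;
        utxos.insert(
            (txid.to_string(), index as u32),
            Unspent { address: address.clone(), value: *value },
        );
    }
    true
}
```

Starting from balances of 1, 2 and 3 BTC and an index entry mapping output 0 of TXID34 to Address2 with 2 BTC, applying a transaction with that single input and the two outputs described above leaves Address1 at 1 BTC, Address2 at 1.5 BTC and Address3 at 3.5 BTC. The lookup in `utxos` is the step that a balances-only file cannot provide.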

This will generate a lot of out-of-sequence (random) disk access requests, making this module very hard on traditional hard drives and mostly only suitable for SSDs.

Goro2030 avatar Aug 03 '20 14:08 Goro2030

Good afternoon. I am also interested in this question. I exported the balances up to block 808500, but unfortunately I still don't understand how to continue from this block so that the balances are calculated correctly and I don't have to start over from block 0.

I tried doing something like this: --start 808500 --blockchain-dir C:.. balances C:..

But when I combine the files balances-0-808500 and balances-808501-808540, the final balance comes out incorrect.

t0nyM0 avatar Sep 20 '23 12:09 t0nyM0

@gcarq Sorry for my possibly inappropriate question. I would be very grateful if you could answer it: to get correct balances, is it possible to avoid recomputing them from the very beginning every time, or not?

t0nyM0 avatar Sep 21 '23 00:09 t0nyM0

@t0nyM0 This is not possible at the moment, because balances only outputs the final balances.csv and discards all UTXOs. So if you resume at a given blockheight, the parser doesn't know about UTXOs in previous blocks and this leads to wrong balances. To make this happen, the current UTXOs (when finished) need to be serialized (e.g.: unspents-0-808500.csv) and loaded again when resuming.

This is a fairly easy change in theory, any pull requests to make this happen are welcome.
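A minimal sketch of the serialization round trip described above could look like this. The `txid;index;value;address` column layout, the `UtxoSet` alias and both function names are assumptions chosen for illustration; they do not reflect the existing unspent.csv format or any code in the repository.

```rust
use std::collections::HashMap;

/// (txid, output index) -> (owner address, value in satoshis)
type UtxoSet = HashMap<(String, u32), (String, u64)>;

/// Writes the UTXO set as `txid;index;value;address` rows under a header.
/// Rows are sorted so the output is deterministic.
fn dump_unspents(utxos: &UtxoSet) -> String {
    let mut rows: Vec<String> = utxos
        .iter()
        .map(|((txid, index), (address, value))| {
            format!("{};{};{};{}", txid, index, value, address)
        })
        .collect();
    rows.sort();
    let mut out = String::from("txid;index;value;address\n");
    for row in rows {
        out.push_str(&row);
        out.push('\n');
    }
    out
}

/// Reads the rows back; returns None if any row is malformed, so that a
/// damaged file is never silently used as the starting state.
fn load_unspents(csv: &str) -> Option<UtxoSet> {
    let mut utxos = UtxoSet::new();
    // Skip the header line written by `dump_unspents`.
    for line in csv.lines().skip(1) {
        let fields: Vec<&str> = line.split(';').collect();
        if fields.len() != 4 {
            return None;
        }
        let index = fields[1].parse::<u32>().ok()?;
        let value = fields[2].parse::<u64>().ok()?;
        utxos.insert(
            (fields[0].to_string(), index),
            (fields[3].to_string(), value),
        );
    }
    Some(utxos)
}
```

A resumed run would call something like `load_unspents` on the file written at the end of the previous run, continue parsing from the next block height, and write a fresh dump when it finishes.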

gcarq avatar Sep 21 '23 08:09 gcarq

@gcarq Thank you very much for your answer! Could you share your email? I have a few private questions for you.

t0nyM0 avatar Sep 21 '23 09:09 t0nyM0

@t0nyM0 Sure! You can find my email address in the commit details when you look for Signed-off-by: https://github.com/gcarq/rusty-blockparser/commit/dffee88e24ecefc000862e68e6e19bfa29ff1976.patch

gcarq avatar Sep 21 '23 09:09 gcarq

@gcarq Please check your e-mail. I sent you a message. My e-mail: [email protected]

t0nyM0 avatar Sep 22 '23 20:09 t0nyM0