Deal with missing transactions

Open cjcobb23 opened this issue 3 years ago • 14 comments

There are some old transactions that rippled cannot deserialize, and thus does not return to clio via ETL. We still need a way to get these transactions, at least as a simple blob, store them in the database, and return them as a simple blob.

cjcobb23 avatar Feb 14 '22 21:02 cjcobb23

Do you have any examples of these? I have just binary deserialized all transactions using ripple-binary-codec, and have not had any issues (https://github.com/XRPLF/xrpl.js/tree/main/packages/ripple-binary-codec#readme).

Silkjaer avatar Mar 23 '22 15:03 Silkjaer

Here are some ledgers that appear to contain transactions that cannot be deserialized. Rippled does not even return these transactions over the API, in any form. It catches an exception, and just returns what it can deserialize. The only way we caught this is we were recomputing hashes for the transaction map for every ledger, and a few hashes were incorrect, pointing to missing transactions.

COULD NOT VERIFY LEDGER TX 562177
COULD NOT VERIFY LEDGER TX 6409247
COULD NOT VERIFY LEDGER TX 7266393
COULD NOT VERIFY LEDGER TX 7266396
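
The detection approach described above (recompute a hash over each ledger's transactions and flag the ledgers where it does not match) can be sketched as follows. This is only an illustration under stated assumptions: `tx_set_digest` is a simplified flat digest, not rippled's actual transaction-map (SHAMap) hashing, and the `Ledger` type, its field names, and `find_unverifiable` are hypothetical names invented for this example.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def sha512_half(data: bytes) -> bytes:
    """Return the first 32 bytes of SHA-512 over the input."""
    return hashlib.sha512(data).digest()[:32]


@dataclass
class Ledger:
    # Hypothetical record type for this sketch only.
    sequence: int
    expected_tx_digest: bytes        # digest recorded over the complete transaction set
    transactions: Dict[bytes, bytes]  # transaction id -> raw transaction blob


def tx_set_digest(transactions: Dict[bytes, bytes]) -> bytes:
    """Simplified stand-in for the transaction-map root hash (NOT a real SHAMap)."""
    combined = b""
    for tx_id in sorted(transactions):
        # Hash each (id, blob) pair, then hash the ordered concatenation.
        combined += sha512_half(tx_id + transactions[tx_id])
    return sha512_half(combined)


def find_unverifiable(ledgers: List[Ledger]) -> List[int]:
    """Return the sequences whose recomputed digest differs from the expected one."""
    return [
        ledger.sequence
        for ledger in ledgers
        if tx_set_digest(ledger.transactions) != ledger.expected_tx_digest
    ]


if __name__ == "__main__":
    full = {b"\x01" * 32: b"blob-a", b"\x02" * 32: b"blob-b"}
    digest = tx_set_digest(full)
    partial = {b"\x01" * 32: b"blob-a"}  # one transaction was silently dropped
    ledgers = [Ledger(1, digest, dict(full)), Ledger(2, digest, partial)]
    for seq in find_unverifiable(ledgers):
        print(f"COULD NOT VERIFY LEDGER TX {seq}")
```

Any dropped transaction changes the recomputed digest, so a mismatch points at a ledger with missing data even though no error was reported when the ledger was fetched.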

cjcobb23 avatar Mar 23 '22 16:03 cjcobb23

I don't know what the transactions are themselves. We would need to modify the rippled code to at the very least print out the tx blob when it tries to deserialize and catches an exception.

cjcobb23 avatar Mar 23 '22 16:03 cjcobb23

Looks like a rippled problem. It should be fixed at the source, not in clio.

godexsoft avatar Jan 12 '23 14:01 godexsoft

I don't think it makes sense to close this. The issue is not fixed, because clio is missing transactions, which means clio cannot fulfill its API promise. I don't think it's relevant what the cause is. Clio is still not behaving as promised or desired.

The route forward here would be to modify the gRPC handlers in rippled which clio uses to extract data, and to at least return the transactions as raw binary. Don't even try to deserialize them. While yes, this code lives in rippled, it was written exclusively for clio, by me, and is a part of rippled that is really owned by clio and the clio team. @injaelee
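
A minimal sketch of the "always return the raw binary" idea, in Python rather than the C++ of the actual gRPC handlers. The function name `extract_transaction`, the record layout, and the `decode` callback are all hypothetical and exist only for illustration; this is not rippled or clio code.

```python
from typing import Any, Callable, Dict


def extract_transaction(
    blob: bytes,
    decode: Callable[[bytes], Dict[str, Any]],
) -> Dict[str, Any]:
    """Always keep the raw blob; attach decoded fields only if decoding succeeds."""
    record: Dict[str, Any] = {
        "blob": blob.hex().upper(),  # raw bytes are preserved unconditionally
        "decoded": None,
        "error": None,
    }
    try:
        record["decoded"] = decode(blob)
    except Exception as exc:  # a decode failure must not drop the transaction
        record["error"] = str(exc)
    return record


if __name__ == "__main__":
    def failing_decoder(data: bytes) -> Dict[str, Any]:
        # Hypothetical decoder that rejects every input, to exercise the fallback.
        raise ValueError("cannot deserialize")

    print(extract_transaction(b"\x12\x00", failing_decoder))
```

The point of the design is that the failure is recorded next to the blob instead of causing the transaction to be skipped, so the consumer can still store and serve the bytes.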

cjcobb23 avatar Jan 12 '23 16:01 cjcobb23

@cjcobb23 we are happy to work towards fixing it if you can provide more info on reproducing this.

godexsoft avatar Jan 13 '23 13:01 godexsoft

You have to find a rippled server with enough history, and extract one of these ledgers:

562177
6409247
7266393
7266396

clio will just skip over the bad transaction, but the verifier script will throw an error when you try to verify the ledger.

Also, running the verifier script on a full history clio server will let you know which ledgers have transactions that cause this.

cjcobb23 avatar Jan 19 '23 23:01 cjcobb23

I seem to be running into this issue when syncing from a particular start sequence, but clio 2.0 won't progress beyond the ledger sequence that has the unserializable transactions. For me, clio can't deserialize ledger 75449940 and thus doesn't proceed to the next ledger sequence. A temporary fix is to modify the ledger_range table in the clio keyspace to skip over the unserializable ledger sequence. It would be nice to get some closure on this issue since it's been lingering for so long.

ajkagy avatar Dec 05 '23 03:12 ajkagy

Hi @ajkagy, thanks for reporting this. To help us reproduce the issue, it would be very helpful if you could provide the following information:

1. Clio's error log
2. The ETL rippled's error log
3. The versions of Clio and its ETL rippled

cindyyan317 avatar Dec 05 '23 11:12 cindyyan317

thanks for the quick reply here @cindyyan317

Here's an attached clio log from startup. It seems to choke on ledger 75681445 after 10 retry attempts and adds other ledgers to the ETL queue, but the ledger_range never updates to skip the ledger that it can't receive, despite rippled having this validated ledger in its db.

Edit:
Rippled version: 1.12.0
clio version: clio-2.0.0
cassandra version: 4.1.3

clio.log

working on pasting my rippled ETL log here.

ajkagy avatar Dec 05 '23 15:12 ajkagy

Adding another note here. A common denominator seems to be that the ledgers where clio can't progress contain a large number of NFT mint txns. I'm trying to determine whether it's a downstream cassandra issue and not necessarily a clio one.

Here are a few more ledgers that stop clio from progressing:
https://bithomp.com/ledger/75755384
https://bithomp.com/ledger/75755417

ajkagy avatar Dec 05 '23 17:12 ajkagy

@cindyyan317 update: I was able to reproduce this on testnet with the exact same rippled, clio and cassandra versions. clio stops progressing.

here is the ledger: https://test.bithomp.com/ledger/43473563

This seems like a completely different issue altogether, but I couldn't find an existing open issue for it.

ajkagy avatar Dec 05 '23 17:12 ajkagy

@ajkagy Thanks for the details. We can't reproduce this; our nodes can process the problematic ledgers. From the log, it looks like clio got stuck while writing to the db.

When was the node upgraded to Clio 2.0? Did it start fresh, or was it migrated? Can you also enable the Backend log?

cindyyan317 avatar Dec 05 '23 19:12 cindyyan317

@cindyyan317 node is fresh. I'm pretty positive this is a configuration thing based on the particular cassandra version and not a Clio issue since we're having no issues with earlier cassandra versions and scylla (which uses an earlier cassandra version). I'll try and get some more detailed logs together.

Thanks for your help!

ajkagy avatar Dec 05 '23 20:12 ajkagy