cardano-db-sync
cardano-db-sync copied to clipboard
Support `tx_out.tx_hash` field
Currently the tx_out table contains a tx_id field, with the autoincremental id of the tx it belongs to. However this means if we want to resolve an input, ie tuple of (tx hash, tx index), we first need to resolve the tx hash -> tx id. This means querying 2 tables, tx and tx_out, instead of simply one. We do have a fairly efficient cache for the first query, however, including the tx_out.tx_hash means we can avoid it altogether.
This becomes more useful, since we want to use Postgres as a backstore for UTxO-HD, effectively reducing memory https://github.com/IntersectMBO/cardano-db-sync/issues/1684
There are many options: This field could replace tx_out.tx_id field, or it could be a new field or it could replace it based on some config option.
Unless this is about replacing tx table altogether, it brings little benefit (definitely not from performance pov, as looking at an indexed hash has higher resource cost than tx_id field which is well maintained index in a foreign table). More often than not - when querying tx input/output, we also fetch block details, which will need us to do that join anyways.
Secondly, it brings back the ugly behaviour of having duplicate hash strings in multiple tables + having duplicate hash strings within tx_out table, using up more storage and index space, for no performance gains
Unless this is about replacing tx table altogether, it brings little benefit
A clarification that the objective here is to reduce future queries that dbsync itself has to do, eventually reducing its memory footprint without a big hit on syncing speed. We need to investigate if the tx hash -> tx id cache that DBSync maintain is good enough or we need to avoid the query altogether.