cardano-db-sync icon indicating copy to clipboard operation
cardano-db-sync copied to clipboard

Future schema changes

Open kderme opened this issue 5 months ago • 4 comments

This ticket is WIP and a placeholder for planned schema changes. It may change and nothing is finalised Some changes are required for simplification and performance improvements. Ideally we will get user feedback for these. With release 14.0.0.0, some of them will break compatibility.

  • reward.earned_epoch, reward_rest.earned_epoch are dropped
  • tx_out.stake_address_id may be replaced by tx_out.stake_address, including the whole stake address.
  • References to redeemer will be dropped (from tx_in, delegation etc)

kderme avatar Jun 16 '25 15:06 kderme

tx_out.stake_address => that would have negative impact (similar reasoning as recently split out address table), moreso for stake_address as it's referred across a lot of tables.

rdlrt avatar Jun 16 '25 21:06 rdlrt

Thanks for the feedback.

The advantage of using a non-normalised tx_out is that it enables replaying the ledger rules directly in PostgreSQL, without requiring a separate UTxO storage.

Currently, db-sync maintains the UTxO set in two distinct ways:

  • In PostgreSQL, for external users
  • In memory, for internal ledger rule replay

With UTxO-HD, alternative backends like memory, LMDB, or LSM-tree can be used. Notably, the existing PostgreSQL schema could also serve as an additional backend option for db-sync.

kderme avatar Jun 16 '25 22:06 kderme

In addition, resolving the stake_address -> stake_addres_id for each output, requires queries from dbsync and adds syncing time.

kderme avatar Jun 16 '25 22:06 kderme

Sure, just weighing pros/cons, I am still unclear how would this work then for all the stake address references across tables (from query pov, the joins will still need to be made), eg: epoch_stake, delegations, rewards, etc? If for those cases, stake_address is going to be maintained regardless - wouldnt this cause duplication overhead?

Also, for tx_out - if stake_address is part of tx_out table, there will be duplication in indexes (index themselves would be much larger than ID fields) + redundant storage utilisation (tens of GBs) as same stake_address can in theory be referred infinite times.

rdlrt avatar Jun 16 '25 23:06 rdlrt