cardano-db-sync
cardano-db-sync copied to clipboard
Indexes mismatch
There is a mismatch between the indexes created by 13.0.x and 13.1.0.x.
Release 13.1.0.0 brought many changes to the schema. One of them is the deletion of many unique keys. Also the indexes that db-sync creates were split in 2 stages of the migration: The 3rd stage has only a small number of indexes that db-sync itself needs. The 4th stage contains indexes that db-sync doesn't need but are still created for backwards compatibility. The 4th stage runs only when db-sync reaches the tip of the chain (or at the begining if --force-indexes
flag is enabled)
While the two releases create eventually the same indexes, one can see that the number of indexes is different with
psql> \di
13.0.5
returns 165
results, while 13.1.0.x
142
.
The reason for this difference is that postgres also created indexes for each unique key. Since unique keys are deleted, their indexes are no longer created by postgres. There are 23 unique keys deleted https://github.com/input-output-hk/cardano-db-sync/blob/1040fa9ec85fd75ce9f02dae2006170136793d02/schema/migration-2-0021-20221019.sql, which exactly matched the difference in indexes.
This difference in indexes may cause users to see big differences in the query speed and violates the promise that the two releases create the same indexes eventually, so it should be at least documented. Users can of-course extend the stage 4 migration files with any index they want.
In general, since db-sync is used by different clients with different query needs, it should be the responsibility of clients to add the indexes that they need. More information can be found https://github.com/input-output-hk/cardano-db-sync/blob/1040fa9ec85fd75ce9f02dae2006170136793d02/doc/migrations.md#upgrading-to-13100
Diffing the schemas I found 22 (23rd for extra_key_witnesses
shouldn't be there anymore due https://github.com/input-output-hk/cardano-db-sync/pull/1283) which can be re-createad with
CREATE UNIQUE INDEX IF NOT EXISTS unique_ada_pots ON ada_pots USING btree (block_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_col_txin ON collateral_tx_in USING btree (tx_in_id, tx_out_id, tx_out_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_col_txout ON collateral_tx_out USING btree (tx_id, index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_delegation ON delegation USING btree (tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_epoch_param ON epoch_param USING btree (epoch_no, block_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_ma_tx_mint ON ma_tx_mint USING btree (ident, tx_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_ma_tx_out ON ma_tx_out USING btree (ident, tx_out_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_param_proposal ON param_proposal USING btree (key, registered_tx_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_owner ON pool_owner USING btree (addr_id, pool_update_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_relay ON pool_relay USING btree (update_id, ipv4, ipv6, dns_name);
CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_retiring ON pool_retire USING btree (announced_tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_update ON pool_update USING btree (registered_tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_pot_transfer ON pot_transfer USING btree (tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_redeemer ON redeemer USING btree (tx_id, purpose, index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_ref_tx_in ON reference_tx_in USING btree (tx_in_id, tx_out_id, tx_out_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_reserves ON reserve USING btree (addr_id, tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_stake_deregistration ON stake_deregistration USING btree (tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_stake_registration ON stake_registration USING btree (tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_treasury ON treasury USING btree (addr_id, tx_id, cert_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_tx_metadata ON tx_metadata USING btree (key, tx_id);
CREATE UNIQUE INDEX IF NOT EXISTS unique_txin ON tx_in USING btree (tx_out_id, tx_out_index);
CREATE UNIQUE INDEX IF NOT EXISTS unique_withdrawal ON withdrawal USING btree (addr_id, tx_id);
Why is there no index for
datum
?
CONSTRAINT unique_datum UNIQUE (hash)
is present in all versions
I think it would be useful to have a warning in the release notes pointing to this issue (for now there is just "Removed many unique keys that were never used").
I lost some significant time investigating performance issues.
Diffing the schemas I found 22 (23rd for
extra_key_witnesses
shouldn't be there anymore due #1283) which can be re-createad withCREATE UNIQUE INDEX IF NOT EXISTS unique_ada_pots ON ada_pots USING btree (block_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_col_txin ON collateral_tx_in USING btree (tx_in_id, tx_out_id, tx_out_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_col_txout ON collateral_tx_out USING btree (tx_id, index); CREATE UNIQUE INDEX IF NOT EXISTS unique_delegation ON delegation USING btree (tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_epoch_param ON epoch_param USING btree (epoch_no, block_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_ma_tx_mint ON ma_tx_mint USING btree (ident, tx_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_ma_tx_out ON ma_tx_out USING btree (ident, tx_out_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_param_proposal ON param_proposal USING btree (key, registered_tx_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_owner ON pool_owner USING btree (addr_id, pool_update_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_relay ON pool_relay USING btree (update_id, ipv4, ipv6, dns_name); CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_retiring ON pool_retire USING btree (announced_tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_pool_update ON pool_update USING btree (registered_tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_pot_transfer ON pot_transfer USING btree (tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_redeemer ON redeemer USING btree (tx_id, purpose, index); CREATE UNIQUE INDEX IF NOT EXISTS unique_ref_tx_in ON reference_tx_in USING btree (tx_in_id, tx_out_id, tx_out_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_reserves ON reserve USING btree (addr_id, tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_stake_deregistration ON stake_deregistration USING btree (tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_stake_registration ON stake_registration USING btree (tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_treasury ON treasury USING btree (addr_id, tx_id, cert_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_tx_metadata ON tx_metadata USING btree (key, tx_id); CREATE UNIQUE INDEX IF NOT EXISTS unique_txin ON tx_in USING btree (tx_out_id, tx_out_index); CREATE UNIQUE INDEX IF NOT EXISTS unique_withdrawal ON withdrawal USING btree (addr_id, tx_id);
these sql make sense with 13.1.0.2 version. the query performance improve a lot after creating indexes.
Wow removing all these indexes slowed down my whole app, many queries relied on these columns to be indexed, removing them was not the best idea
Wow removing all these indexes slowed down my whole app, many queries relied on these columns to be indexed, removing them was not the best idea
DBSync is currently used as a general purpose backend for Cardano applications. It's impossible to cover all indexes that users may want. Client should be responsible to monitor their queries, eg with EXPLAIN ANALYSE
, and add any indexes that they require. Additionally indexes use a lot of disk space, cause unnecesary delays and removing them has significantly increased syncing speed.
i agree, but it cam a bit offhand, should have been marked as breaking change