Horizon: Consider dropping duplicated columns from the database schema.

Open • Shaptic opened this issue 4 years ago • 0 comments

What problem does your feature solve?

Some of the columns represent identical values and are artifacts from earlier migrations. Their names can also be misleading. Examples include:

history_ledgers.transaction_count == history_ledgers.successful_transaction_count
history_ledgers.operation_count is the successful subset of history_ledgers.tx_set_operation_count
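
Before dropping anything, it would probably be worth confirming that the two relationships above actually hold for every row. The following is a minimal sketch of such a check. It uses Python's built-in sqlite3 module with a toy in-memory table as a stand-in for the real Horizon database, and the helper names (count_transaction_mismatches, count_operation_violations) and the sample rows are hypothetical.

```python
import sqlite3


def count_transaction_mismatches(conn):
    """Count ledgers where the two supposedly identical columns differ.

    `IS NOT` is SQLite's NULL-safe inequality; in PostgreSQL the
    equivalent operator would be `IS DISTINCT FROM`.
    """
    row = conn.execute(
        "SELECT COUNT(*) FROM history_ledgers "
        "WHERE transaction_count IS NOT successful_transaction_count"
    ).fetchone()
    return row[0]


def count_operation_violations(conn):
    """Count ledgers where operation_count exceeds tx_set_operation_count,
    which would contradict the 'successful subset' relationship."""
    row = conn.execute(
        "SELECT COUNT(*) FROM history_ledgers "
        "WHERE operation_count > tx_set_operation_count"
    ).fetchone()
    return row[0]


# Toy stand-in table: only the columns named in this issue, plus a key.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE history_ledgers (
           sequence INTEGER PRIMARY KEY,
           transaction_count INTEGER,
           successful_transaction_count INTEGER,
           failed_transaction_count INTEGER,
           operation_count INTEGER,
           tx_set_operation_count INTEGER)"""
)
# Hypothetical sample rows, not real ledger data.
conn.executemany(
    "INSERT INTO history_ledgers VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 3, 3, 1, 5, 7), (2, 0, 0, 0, 0, 0)],
)

print(count_transaction_mismatches(conn))
print(count_operation_violations(conn))
```

If either count is non-zero on a real database, the columns are not interchangeable and the drop would lose information.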

There may be others. Dropping them would trim a few hundred MB (based on some napkin math) from a full-history database after a vacuum.
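
The napkin math can be sketched roughly as follows. The 8 bytes per ledger is the figure quoted later in this issue; the ledger count is an assumed round number chosen only for illustration, and the estimate ignores per-row overhead, alignment, and indexes.

```python
# Redundant storage per ledger row, as quoted in this issue.
BYTES_PER_LEDGER = 8

# ASSUMPTION: a hypothetical ledger count for a full-history database.
# Substitute the real number of rows in history_ledgers.
ASSUMED_LEDGER_COUNT = 36_000_000


def redundant_bytes(ledger_count, bytes_per_ledger=BYTES_PER_LEDGER):
    """Return the raw bytes occupied by the redundant columns."""
    return ledger_count * bytes_per_ledger


saved_mb = redundant_bytes(ASSUMED_LEDGER_COUNT) / 1_000_000
print(f"{saved_mb:.0f} MB")  # prints "288 MB" for the assumed count
```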

What would you like to see?

A migration that drops those columns from the database entirely.

What alternatives are there?

We could also drop and rename certain columns. For example,

rename operation_count -> successful_operation_count and rename tx_set_operation_count -> operation_count
if we apply the above, transactions should follow the same pattern (track total & successful): replace the current transaction_count with the total, i.e. transaction_count = successful_transaction_count + failed_transaction_count, and then drop failed_transaction_count, which can be derived from the other two.
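
A rough sketch of this alternative, again using Python's sqlite3 module and a toy in-memory table rather than the real Horizon database, might look like the following. It assumes SQLite 3.25 or newer for RENAME COLUMN, the sample row is hypothetical, and the final drop of failed_transaction_count is left out of the sketch because DROP COLUMN needs a newer SQLite. A real migration would also have to update every query that reads these columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE history_ledgers (
           sequence INTEGER PRIMARY KEY,
           transaction_count INTEGER,
           successful_transaction_count INTEGER,
           failed_transaction_count INTEGER,
           operation_count INTEGER,
           tx_set_operation_count INTEGER)"""
)
# Hypothetical sample row: 3 successful + 1 failed transactions,
# 5 successful operations out of 7 in the transaction set.
conn.execute("INSERT INTO history_ledgers VALUES (1, 3, 3, 1, 5, 7)")

# Order matters: free up the name operation_count before reusing it.
conn.execute(
    "ALTER TABLE history_ledgers "
    "RENAME COLUMN operation_count TO successful_operation_count"
)
conn.execute(
    "ALTER TABLE history_ledgers "
    "RENAME COLUMN tx_set_operation_count TO operation_count"
)

# Make transaction_count the total rather than a duplicate of
# successful_transaction_count.
conn.execute(
    "UPDATE history_ledgers "
    "SET transaction_count = "
    "successful_transaction_count + failed_transaction_count"
)
conn.commit()

columns = [info[1] for info in conn.execute("PRAGMA table_info(history_ledgers)")]
row = conn.execute(
    "SELECT transaction_count, successful_operation_count, operation_count "
    "FROM history_ledgers"
).fetchone()
print(columns)
print(row)
```

After these steps both pairs follow the same naming pattern: an unqualified total column and a successful_ counterpart.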

Another alternative is just... not doing this, which means an unnecessary 8 bytes of storage for every new ledger ingested, but that's basically a rounding error next to the size of a full-history database.

Jul 29 '21 17:07 Shaptic