indexer icon indicating copy to clipboard operation
indexer copied to clipboard

Indexer could use smaller hash indexes instead of btree indexes

Open urtho opened this issue 3 years ago • 1 comments

Problem

All indexer indices are btree but not all have to be.

Solution

Replace btree indexes with hash indexes in places where only exact searches are used and the resulting hash index is significantly smaller.

One such example could be :

ledgerdb=# create index concurrently txn_by_tixid_hash on txn using hash (txid);
CREATE INDEX

ledgerdb=# \di+
                                         List of relations
 Schema |             Name              | Type  | Owner |       Table       |  Size   | Description
--------+-------------------------------+-------+-------+-------------------+---------+-------------
 public | txn_by_tixid                  | index | algo  | txn               | 43 GB   |
 public | txn_by_tixid_hash             | index | algo  | txn               | 16 GB   |

Urgency

Performance optimization.

urtho avatar Feb 21 '22 12:02 urtho

Hash or nonclustered may be good for mixed data sets, if clustering/ordering lookups on common keys are not needed: https://postgrespro.com/blog/pgsql/4161321

Do you have an example proc, table and index to look into?

I am not up to speed on postgres, but in general if page or row or columnstore compression can be leveraged, that would also reduce disk utilization depending on the index type.

If summarized results are needed clustered columnstore type indexing may offer more compression for summary performance but poor insert/update maintenance, although for large data warehouses they can also help compress wide complex data sets.

oroechimaru avatar Mar 16 '22 03:03 oroechimaru