cardano-db-sync icon indicating copy to clipboard operation
cardano-db-sync copied to clipboard

Investigate using BRIN indexes

Open kderme opened this issue 2 years ago • 4 comments

BRIN are a special case of postgres indexes https://www.postgresql.org/docs/current/brin-intro.html. They seem to better fit the workload of db-sync

BRIN is designed for handling very large tables in which certain columns have some natural correlation with their 
physical location within the table.

Many of the fields in the db are non decreasing. This could result in faster syncing, smaller disk usage. Queries could also be affected It requires adequate benchmarks both for syncing and querying.

Depends on https://github.com/input-output-hk/cardano-db-sync/issues/1087 as many of the existing indexes will be deleted

kderme avatar Oct 30 '22 14:10 kderme

This will reduce disk usage by several GBs (10 or more likely). May improve sync time, but not certain. May impact queries, could improve them but could slow them. Ideally we'll need to benchmark first.

marshada avatar Oct 31 '22 20:10 marshada

Queries seem to slow down as a result of using these Indexes. Even by setting the pages_per_range to 1, queries are slower compare to btree indexes.

I haven't benchmarked the sync speed for them, but since many indexes are removed with https://github.com/input-output-hk/cardano-db-sync/issues/1087, it's not expected to make a difference

So the main benefit is disk space.

kderme avatar Nov 04 '22 13:11 kderme

In my current use case, query time is the choke point.

mgajda avatar Feb 06 '23 21:02 mgajda

In my current use case, query time is the choke point.

Without supplying your query, this assertion does not carry much weight. It is trivially easy to write queries with very poor performance. In many instances these bad/slow queries can be rewritten to be significantly faster (I had one that was over 100x faster after a think and a rewrite).

erikd avatar Feb 07 '23 02:02 erikd