Query: precompute various statistics for past hour, day, week, month, year, etc.
For chain analysis and block explorer use cases, it would be valuable to settle on a strategy for pre-computing common aggregate statistics. (It's always possible to use aggregation functions with the admin role, but some strategy for enabling fast, public access to common aggregations would be valuable.)
One option would be to use Hasura's cron triggers to refresh a materialized view (already possible, but linked issue is about making it simpler).
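As a rough sketch of that approach, the cron trigger's webhook would do nothing more than refresh a precomputed view. All table, column, and view names below are assumptions for illustration only, not Chaingraph's actual schema:

```sql
-- Hypothetical intraday statistics view (names assumed, not the real schema).
CREATE MATERIALIZED VIEW stats_intraday AS
SELECT
  count(*)          AS transaction_count,
  sum(t.size_bytes) AS total_transaction_bytes
FROM transaction t
  JOIN block_transaction bt ON bt.transaction_hash = t.hash
  JOIN block b ON b.hash = bt.block_hash
WHERE b.timestamp > extract(epoch FROM now()) - 86400;

-- The endpoint called by the Hasura cron trigger would simply run:
REFRESH MATERIALIZED VIEW stats_intraday;
```

(`REFRESH MATERIALIZED VIEW CONCURRENTLY` would avoid blocking readers during the refresh, but it requires a unique index on the view.)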
It would probably be valuable to have at least two materialized views – one for intraday statistics (e.g. past block, hour, 2 hours, 6 hours, 12 hours, day), which is relatively cheap to refresh, and one for longer-range statistics (past week, 2 weeks, month, 3 months, 6 months, year, 3 years, all time), which may take several minutes to refresh. Intraday statistics could probably be regenerated after every block using a trigger on block inserts; long-term statistics could be generated every 12 or 24 hours.
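The per-block refresh could be wired up with a statement-level trigger, sketched below (again, table and view names are assumptions). One caveat: refreshing synchronously inside the trigger makes block inserts wait for the refresh to complete, so in practice a `NOTIFY`/listener or a job queue would likely be preferable:

```sql
-- Hypothetical sketch; "block" and "stats_intraday" are assumed names.
CREATE OR REPLACE FUNCTION refresh_stats_intraday() RETURNS trigger AS $$
BEGIN
  REFRESH MATERIALIZED VIEW stats_intraday;
  RETURN NULL; -- return value is ignored for AFTER ... FOR EACH STATEMENT
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER refresh_intraday_after_block
  AFTER INSERT ON block
  FOR EACH STATEMENT
  EXECUTE FUNCTION refresh_stats_intraday();
```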
Useful statistics would include:
- total transaction count
- UTXO set change (output count - input count)
- unique locking_bytecode (address) used count
- total, average, and 10th, 25th, 50th, 75th, and 90th percentile transaction value
- total, average, and 10th, 25th, 50th, 75th, and 90th percentile transaction byte size
- count of transactions with output value greater than 0.01, 0.1, 1, 10, 100, 1,000, 10,000
- total fees paid
- coins generated (inflation)
- coin days destroyed (#31)
- active supply (total_circulation - inactive_balances_over_period)
- P2PKH vs. P2SH distribution using locking_bytecode_pattern (maybe p2pkh_percent?)
- For BCH: distribution of P2PKH inputs using schnorr vs. DER signatures (possible using unlocking_bytecode_pattern; distribution within multisig P2SH would even be possible with redeem_bytecode_pattern)
- approx. mining revenue for 1 terahash/s
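For the percentile statistics above, PostgreSQL's ordered-set aggregates (`percentile_cont`) make the view definition fairly direct. A hedged sketch, with column and table names assumed for illustration:

```sql
-- Hypothetical transaction-value statistics (names assumed, not the real schema).
SELECT
  sum(tx_value) AS total_value,
  avg(tx_value) AS average_value,
  percentile_cont(ARRAY[0.10, 0.25, 0.50, 0.75, 0.90])
    WITHIN GROUP (ORDER BY tx_value) AS value_percentiles
FROM (
  -- per-transaction value: sum of that transaction's outputs
  SELECT sum(o.value_satoshis) AS tx_value
  FROM output o
  GROUP BY o.transaction_hash
) per_transaction;
```

The same `percentile_cont(...) WITHIN GROUP` pattern would cover the byte-size percentiles as well.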
Since some of these statistics would still be too expensive to compute unless #29 is enabled on the columns in question, only a subset of the cheaper ones should be enabled by default; the rest should probably just be configurable.
This would also be valuable for #10.