elixir-omg

Calculate metrics on lazily loaded UTXO set

Open pnowosie opened this issue 6 years ago • 6 comments

PR #1103 makes it possible to load a subset of the UTXO set when needed instead of loading all of it during startup. This means that the in-memory UTXO set contains only information processed since the last service restart.

Metrics collected so far:

Where should we move the metrics calculation :question:

pnowosie avatar Nov 18 '19 14:11 pnowosie

PS: Maybe this really belongs in the informational watcher, for product purposes. But the Child Chain really needs this visibility.

InoMurko avatar Dec 18 '19 16:12 InoMurko

There are two sets of metrics we're interested in:

  1. Product metrics related to UTXOs (how utilized we are, what amounts are going in and out)
    • Balance and unique addresses really belong to the informational watcher (SQL)
  2. Technical engineering perspective (separate issue TBD)
    • Round-trip measurements, size of the state set, memory implications, etc.

InoMurko avatar Dec 18 '19 16:12 InoMurko

I'm tackling this ticket from the following angles:

  • :authority_balance
    • Stays in childchain since it's primarily the operator's concern to top up the balance.
    • Currently reported only when there's a block submission. This means the monitors can't differentiate between no data and broken reporting.
    • Change to report periodically, e.g. every 5 minutes.
    • Update: This one is not urgent, since the :authority_balance should only decrease on a block submission anyway.
  • :balance per token
    • Stays in childchain so the operator can be alerted to the network's insolvency.
    • Add to watcher and watcher_info so the integrator/user can be alerted to the network's insolvency.
    • How do we aggregate the not-loaded UTXOs without significant performance/resource impact?
      • Maybe an async task, spawned at app startup, that fetches and aggregates the not-loaded UTXOs in batches.
  • :unique_users
    • Move to watcher_info; it's good-to-have info for business insights, but not an indicator of network health.
    • Since it's only needed in watcher_info, we can populate and aggregate the info from the informational database.
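
To make the batching idea for `:balance` per token concrete, here is a minimal sketch in Python (used only as a language-neutral illustration, not the project's Elixir code). The `Utxo` record, the `batches` helper and `balance_per_token` are all hypothetical names; the real UTXO shape and the way not-loaded UTXOs are fetched from the DB would differ.

```python
from itertools import islice
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Utxo(NamedTuple):
    """Hypothetical, simplified UTXO record; not the elixir-omg data model."""
    owner: str
    currency: str
    amount: int


def batches(utxos: Iterable[Utxo], size: int) -> Iterator[List[Utxo]]:
    """Yield successive lists of at most `size` UTXOs from any iterable."""
    it = iter(utxos)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def balance_per_token(utxos: Iterable[Utxo], batch_size: int = 1000) -> Dict[str, int]:
    """Sum amounts per currency while holding only one batch in memory."""
    totals: Dict[str, int] = {}
    for batch in batches(utxos, batch_size):
        for utxo in batch:
            totals[utxo.currency] = totals.get(utxo.currency, 0) + utxo.amount
    return totals
```

Because `utxos` can be any iterable, the source could be a lazy DB cursor, so the full set never needs to be resident at once. Only the per-token totals grow, and they are bounded by the number of tokens.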

unnawut avatar Jul 21 '20 05:07 unnawut

[more for discussion, probably not really suitable for this task]

If there are multiple metrics, we might consider investing some time in exploring whether we can stream the DB changes.

One design I've seen and really liked was: Service -> DB -> DB change stream -> probably some computing job to change the format or similar -> logs, business DB, etc.

This gives a really clear boundary, and the performance impact is fairly limited. However, what I saw also relied on a fully cloud-supported feature to enable that 😅 The DynamoDB I used to use has a streaming feature which is amazing to use. I think if we go this way we would need to build our own streaming mechanism on our DBs (RocksDB, Postgres <-- I guess Postgres may have a similar feature, but I'm not sure).
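
As a rough illustration of that pipeline shape, here is a sketch in Python where an in-process queue stands in for the DB change stream and a small consumer keeps a per-token balance up to date. Everything here is assumed for illustration: the `Change` event, its `kind` values, and the `apply_change` and `drain` helpers are hypothetical and do not correspond to any existing RocksDB, Postgres or DynamoDB API.

```python
import queue
from typing import Dict, NamedTuple


class Change(NamedTuple):
    """Hypothetical change event emitted after each DB write."""
    kind: str       # assumed values: "utxo_created" or "utxo_spent"
    currency: str
    amount: int


def apply_change(balances: Dict[str, int], change: Change) -> None:
    """Update the running per-token balance for a single change event."""
    if change.kind == "utxo_created":
        delta = change.amount
    elif change.kind == "utxo_spent":
        delta = -change.amount
    else:
        raise ValueError(f"unknown change kind: {change.kind}")
    balances[change.currency] = balances.get(change.currency, 0) + delta


def drain(stream: "queue.Queue[Change]", balances: Dict[str, int]) -> int:
    """Apply every change currently queued; return how many were processed."""
    processed = 0
    while True:
        try:
            change = stream.get_nowait()
        except queue.Empty:
            return processed
        apply_change(balances, change)
        processed += 1
```

The service only writes and emits events; the metrics job consumes them on its own schedule, which is where the clear boundary and limited performance impact would come from.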

boolafish avatar Jul 21 '20 05:07 boolafish

Moving/noting this down into https://github.com/omgnetwork/private-issues/issues/66

unnawut avatar Jul 21 '20 08:07 unnawut

After more thought, fixing the balance is not easy, so:

  • :authority_balance -> fix later; it reports accurately, just not frequently enough
  • :unique_users -> fix later; doesn't impact network health
  • :balance per token -> working on this; needed to monitor the network's solvency

unnawut avatar Jul 21 '20 11:07 unnawut