Add Observability to Faucet

Open fainashalts opened this issue 1 year ago • 2 comments

Add observability for the status of faucet requests so we can see how often users encounter failures. This should most likely use the existing Grafana dashboard. Once this task is complete, we should add a pageable event at a certain failure threshold (we can experiment to find the right level) so that on-call is alerted.

fainashalts avatar Jul 31 '24 22:07 fainashalts

Created a Mixpanel dashboard for tracking usage of the faucet on dev console: https://mixpanel.com/s/2Y71kt

Based on the usage I am seeing, it doesn't seem like the captcha has had any effect on the number of drips. I still suspect people are botting this, based on the number of users using Privy auth and the fact that so many users are from one region:

Image

tremarkley avatar Aug 14 '24 22:08 tremarkley

Started working on updating the Grafana dashboard. The "Count of faucet drip failures" panel now points at the faucet failures on dev console.

The drippie contract balance panel is pointed to the right contract.

The L1 Sepolia Faucet Contract and L1 Sepolia Faucet Admin balance panels still point to the old faucet, but should automatically update to the new faucet once this merges and the gateway pods are updated: https://github.com/ethereum-optimism/k8s/pull/4283

The next step is to add alerting so that we are notified when drip failures increase.

tremarkley avatar Aug 15 '24 01:08 tremarkley

Created 4 Grafana alerts for the faucet:

  • L1 Sepolia: Faucet Admin Wallet Balance
    • Triggered when the admin wallet drops below 0.1 ETH
  • L1 Sepolia: Faucet Contract Balance
    • Triggered when the faucet balance drops below 10 ETH
  • L1 Sepolia: Faucet Drippie Contract Balance
    • Triggered when the drippie contract drops below 1000 ETH
  • Spike in faucet claim tx failures
    • Triggered when more than 20 faucet drip txs have failed in the last hour
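
For anyone who wants to reason about these thresholds outside Grafana, here is a minimal Python sketch of the same four conditions. It is not the actual Grafana alert configuration: the function name, the dictionary keys, and the input shapes (balances in ETH, a list of failure timestamps) are all assumptions made for illustration. Only the threshold values come from the list above.

```python
from datetime import datetime, timedelta

# Threshold values are from the alert list above; the key names are hypothetical.
BALANCE_FLOORS_ETH = {
    "faucet_admin_wallet": 0.1,
    "faucet_contract": 10.0,
    "drippie_contract": 1000.0,
}
MAX_FAILED_DRIPS_PER_HOUR = 20


def firing_alerts(balances_eth, failure_times, now):
    """Return the names of the alerts that would fire for the given inputs.

    balances_eth: dict mapping each key in BALANCE_FLOORS_ETH to a balance in ETH.
    failure_times: iterable of datetimes, one per failed drip transaction.
    now: the evaluation time.
    """
    # Balance alerts fire when a balance is strictly below its floor.
    firing = [
        name
        for name, floor in BALANCE_FLOORS_ETH.items()
        if balances_eth[name] < floor
    ]

    # The failure alert fires when more than 20 failures fall in the last hour.
    window_start = now - timedelta(hours=1)
    recent_failures = sum(1 for t in failure_times if window_start < t <= now)
    if recent_failures > MAX_FAILED_DRIPS_PER_HOUR:
        firing.append("faucet_claim_tx_failures")

    return firing
```

Note that the failure condition is strict: exactly 20 failures in the window does not fire, which matches the "> 20" wording above.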

For now, these alerts will ping the #pod-devx Slack channel. The next step is integrating the alerts into Opsgenie so that they are sent to the on-call engineer.

If you're curious, the alerts can be found here: https://optimistic.grafana.net/alerting/list?search=dashboard:ebc95cfa-3368-410f-9d72-ca240f4e2831

tremarkley avatar Sep 06 '24 23:09 tremarkley