open-forms icon indicating copy to clipboard operation
open-forms copied to clipboard

Detect more problems and add them to the email digest

Open sergei-maertens opened this issue 1 year ago • 5 comments

Target audience: functional administrators Goal: avoid things breaking silently (things will break)

Things that go wrong during runtime

  • [x] All submissions with failed registration status (or a handy admin link to filter those)

    • check TimelineLog from past 24 hours
    • aggregate on form title/name + number of failed submissions + clickable link to submissions admin with preconfigured filters, e.g. /admin/submissions/submission?form=123&state=failed&from=24hAgo
    • (add submission filter for 'past 24 hours')
    • when was first & last failure in the past 24 hours
    • example text in email: Aanvraag subsidie registratie gefaald: 3 times between 14:03 and 15:24 in the past 24 hours
  • [x] Log failures (e.g. from prefill) (e.g. create log record that we can filter on) etc. & include how many failed each day (and aggregate problem)

See also #3999 which may overlap for (Prometheus) metrics.

  • The page to view these relevant log records should include/display sufficient context, like which form(s) are affected, timestamps..., type of error (maybe)?
  • Pointers to where people can look to investigate
    • logs
    • tip to enable request logging?

Things that can go wrong/cause support overhead because of broken configuration

  • [x] If addressNL components are used in (active) forms, then the BRK client must be configured properly. If this is not the case, include it as a problem that needs attention. (from github PR review comment: )
  • [x] Deriving street/city from postal code/number without configuring the BAG API
  • [x] Are forms still valid or not?
    • Registration backend configuration
    • Broken logic rules (fields that no longer exist)
    • ~~Deprecated components~~ (in screen warning better?)
  • [x] (zgw consumers) services / configuration overview -> run test periodically & include result?
    • certificate checks (inspecting error responses + educated guesses/suggestions)
    • warn of certificate expiries (when <= 2 weeks to go?)
    • host connection / credentials
    • configuration complete or not

sergei-maertens avatar Dec 29 '23 10:12 sergei-maertens