Client Errors - Improve observability and quality

Open bobbykolev opened this issue 10 months ago • 5 comments

Description

We need better observability of the client so that we can track and fix user issues, set quality KPIs, and categorize issues more effectively.

Goal

This initiative aims to positively impact our users' experience. Better detection and tracking of critical issues would lead to quicker focus and resolution. Setting quality KPIs will give visibility to the stakeholders of the client. The critical client errors should be resolved with priority.

Hypothesis

By utilizing and improving the Sentry integration (3rd party tool) and APM, we can better track and organize client errors. Once we better organize the errors and fix the critical ones, we can define specific KPI targets and set a monitoring schedule.

Must have scope

Structure the process around client observability:

[ ] Revise and improve Sentry logging.
[ ] Revise APM, how we can use it in isolation or combination with Sentry.
[ ] Set reasonable crash-free, performance, and unhandled errors KPIs.
[ ] Set alerting, monitoring, and ownership (on critical errors, new errors, post-release, etc.).

Analysis of the issues being experienced, with recommendations of issues to be addressed:

[ ] Tag/categorize client errors (by severity and domain).
[ ] Log the current critical errors.
[ ] Log and fix or categorize the most common errors (to reduce the noise).

Optional:

[ ] Research which other Sentry features (Metrics, Replays, etc.) could be used to better track user experience.

Next:

[ ] Plan the next epic, covering the heavier issues.

Here's a link to the initial challenge.

Stakeholders

@techsmyth @valentinyanakiev @me-andre @hero101 @Comoque1 @bobbykolev

bobbykolev, Mar 31 '24 08:03

Thanks for opening your first issue here! Be sure to follow the issue template!

welcome[bot], Mar 31 '24 08:03

We already have an implementation for Level (fatal, error, warning, etc.) in Sentry/log.ts. We could extend the same implementation to enrich the context with tags: https://docs.sentry.io/platforms/javascript/guides/react/enriching-events/tags/ Tags would let us identify the affected functionality (Auth, Server, Callout, etc.) and attach other useful information that helps us track down an error faster.
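A minimal sketch of what that extension could look like, assuming a small helper that sets both the level and the tags on a Sentry scope. All names here (`ErrorDomain`, `ErrorContext`, `ScopeLike`, `applyErrorContext`, the `domain` tag key) are hypothetical and not taken from Sentry/log.ts; the only Sentry calls relied on are the scope methods `setLevel` and `setTag`.

```typescript
// Hypothetical names throughout: none of these types or functions are taken
// from the existing Sentry/log.ts; they only illustrate the idea.

// Severity values mirror the Sentry levels mentioned in the comment above.
type Severity = "fatal" | "error" | "warning" | "info" | "debug";

// Assumed set of functional domains, based on the Auth/Server/Callout examples.
type ErrorDomain = "auth" | "server" | "callout" | "unknown";

interface ErrorContext {
  severity: Severity;
  domain: ErrorDomain;
  // Optional additional tags; the keys are up to the team to define.
  extraTags?: Record<string, string>;
}

// The minimal subset of the Sentry scope API used here, so the helper can be
// unit-tested with a fake scope and without the SDK installed.
interface ScopeLike {
  setLevel(level: Severity): unknown;
  setTag(key: string, value: string): unknown;
}

// Applies the severity and the domain tag (plus any extra tags) to a scope.
function applyErrorContext(scope: ScopeLike, ctx: ErrorContext): void {
  scope.setLevel(ctx.severity);
  scope.setTag("domain", ctx.domain);
  const extra = ctx.extraTags || {};
  for (const key of Object.keys(extra)) {
    scope.setTag(key, extra[key]);
  }
}
```

In the client this would presumably be called inside `Sentry.withScope((scope) => { applyErrorContext(scope, ctx); Sentry.captureException(error); })`, so the tags apply only to that one event. Keeping the helper independent of the SDK import is a design choice for testability, not a requirement.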

bobbykolev, Mar 31 '24 21:03

The following epic duplicates this one: https://app.zenhub.com/workspaces/alkemio-development-5ecb98b262ebd9f4aec4194c/issues/gh/alkem-io/alkemio/1291

bobbykolev, Apr 12 '24 11:04

@bobbykolev good catch, can you please merge the other epic into this one? That means both the description / information and the issues (if any are still relevant).

techsmyth, Apr 13 '24 16:04

Done.

bobbykolev, Apr 18 '24 07:04