fabric8-wit
fabric8-wit copied to clipboard
Sentry shouldn't group unrelated errors together
According to https://docs.sentry.io/learn/rollups/#grouping-by-stacktrace page, all errors going to sentry are grouped according to stacktrace (if present)
. The doc says
If a stacktrace or exception is involved in a report, then grouping will only consider this information.
Most of the errors thrown from the backend contain this minimal stacktrace
File "/tmp/go/src/github.com/fabric8-services/fabric8-wit/sentry/sentry.go", line 101, in func1
File "/tmp/go/src/github.com/fabric8-services/fabric8-wit/sentry/sentry.go", line 75, in loop
due to https://github.com/fabric8-services/fabric8-wit/blob/59561d074d5bc6574994b4821db0bff0278fa010/sentry/sentry.go#L101
Since all events have similar stacktrace they are all grouped together into one issue and it makes it very difficult to search for errors. Look at https://errortracking.prod-preview.openshift.io/openshift_io/fabric8-wit/issues/2083/events/ . It contains errors like
- errors.InternalError: could not delete space
- errors.InternalError: space UUID 51b37ab5-81cd-424d-90d7-3327860f0d64 is not valid space name
- errors.InternalError: namespace with id 'user' not found
This can be fixed by either one of the following
-
Improving the sentry client to use interface (interface) so that the events can be distinguished - This one seems to be the most feasible option and it would require minimal effort.
-
Refactoring all errors so that instead of just returning the errors from the controller we return error along with the stacktrace using
errs.WithStack(....)
- I don't think we want this because it doesn't make sense to return the entire stacktrace to the user. -
Moving the sentry logging out of the
jsonapi.JSONErrorResponse()
into each controller so that every error can be logged/handled separately - This might solve the problem but this would mean we have to addsentryClient.CaptureError(...)
to all error handling block. This doesn't seem feasible to me. -
Removing stacktrace from the error if it didn't have one originally - The sentry client somehow captures these two calls in the stacktrace
File "/tmp/go/src/github.com/fabric8-services/fabric8-wit/sentry/sentry.go", line 101, in func1 File "/tmp/go/src/github.com/fabric8-services/fabric8-wit/sentry/sentry.go", line 75, in loop
if we could get rid of these then all the errors would be grouped according to the error message which would make much more sense.
I think option 1 would be the best way because it would allow us to send custom data (like controller name) to sentry.
What do you think @kwk @xcoulon @aslakknutsen ?