kibana icon indicating copy to clipboard operation
kibana copied to clipboard

[Alerting] Change logger level to debug when delete a rule without alerts produce an error or warn when untracking alerts

Open cnasikas opened this issue 1 year ago • 3 comments

Summary

Fixes: https://github.com/elastic/kibana/issues/182754

For maintainers

cnasikas avatar May 11 '24 09:05 cnasikas

Pinging @elastic/response-ops (Team:ResponseOps)

elasticmachine avatar May 11 '24 09:05 elasticmachine

@elasticmachine merge upstream

cnasikas avatar May 12 '24 09:05 cnasikas

@elasticmachine merge upstream

cnasikas avatar May 13 '24 07:05 cnasikas

Hmmm, perhaps my suggestion of changing the error/warn to debug was not the best :-)

It looks like this would end up setting the level for any errors when alerts are updated. My main issue was having messages when there aren't any alerts to update.

Peeking at the code, I found this:

https://github.com/elastic/kibana/blob/551e1232b870654d036aff4d52f43443ee3ecf20/x-pack/plugins/alerting/server/alerts_service/lib/set_alerts_to_untracked.ts#L226-L228

Can we just not throw an error here? Or maybe that is indicative of some kind of error, in some case through the loop and with that query? But it is definitely the case that when a rule is deleted / disabled and has no alerts, these logs are generated - which is what I want to avoid.

pmuellr avatar May 13 '24 12:05 pmuellr

Good point @pmuellr! I missed that. I reverted the log levels and stopped throwing an error when alerts were not found in e0c65fe (#183207). I decided to return an empty array if there are no results. I checked where the function is being called and it seems safe to do so. The only place where the response is used is at https://github.com/elastic/kibana/blob/e76c5edb2a80c14f91b82bf02f29c6d9175ce5e8/x-pack/plugins/alerting/server/application/rule/methods/bulk_untrack/bulk_untrack_alerts.ts#L67. Let me know if that's ok.

cnasikas avatar May 14 '24 13:05 cnasikas

I reverted the log levels and stopped throwing an error when alerts were not found in e0c65fe (#183207). I decided to return an empty array if there are no results. I checked where the function is being called and it seems safe to do so. The only place where the response is used is at

https://github.com/elastic/kibana/blob/e76c5edb2a80c14f91b82bf02f29c6d9175ce5e8/x-pack/plugins/alerting/server/application/rule/methods/bulk_untrack/bulk_untrack_alerts.ts#L67

. Let me know if that's ok.

That seems like the right thing to do. Following that bulk untrack module, it seems like it might actually try to make some kind of bulk call later (to clear out some alerts in task state docs) with an empty array. Starting here: https://github.com/elastic/kibana/blob/71f856b527344f1bacb6240081403aeeb9a060da/x-pack/plugins/alerting/server/application/rule/methods/bulk_untrack/bulk_untrack_alerts.ts#L67-L72

Not sure how ES will handle a bulk update with no actual updates :-), or perhaps it's short-circuited elsewhere. Seems like deleting a rule which never had an active alert will tell us. I have a feeling we may need to short-circuit that ourselves, if the number of alerts is 0 ...

pmuellr avatar May 14 '24 14:05 pmuellr

I have a feeling we may need to short-circuit that ourselves, if the number of alerts is 0 ...

I was also skeptical about it but wanted your opinion before doing it 🙂. Thanks! if we return early like

 if (taskIds.length === 0) { <---- should we context.auditLogger?.log here???
      return;
    }
    
    await context.taskManager.bulkUpdateState(taskIds, (state, id) => { //// })
    
    context.auditLogger?.log(
      ruleAuditEvent({
        action: RuleAuditAction.UNTRACK_ALERT,
        outcome: 'success',
      })
    );

should we also audit log when we return without calling context.taskManager.bulkUpdateState?

cnasikas avatar May 14 '24 14:05 cnasikas

should we also audit log when we return without calling context.taskManager.bulkUpdateState?

Great question - should we log that we did nothing? :-) I guess I'm thinking we should. If we don't, then the pattern of audit logs would look no different than if something went wrong and that log doc didn't get written. So, it might look like an error to someone, to not see that. @mikecote thoughts?

pmuellr avatar May 14 '24 16:05 pmuellr

Great question - should we log that we did nothing? :-) I guess I'm thinking we should. If we don't, then the pattern of audit logs would look no different than if something went wrong and that log doc didn't get written. So, it might look like an error to someone, to not see that. @mikecote thoughts?

Yes I think so, it matches patterns like enabling an already enabled rule and such (audit log events are still logged).

mikecote avatar May 14 '24 16:05 mikecote

@pmuellr @mikecote Done in 45bef8e (#183207)

cnasikas avatar May 15 '24 08:05 cnasikas

:green_heart: Build Succeeded

Metrics [docs]

✅ unchanged

History

  • :green_heart: Build #209858 succeeded da2e343da989370378b42d3fa70782f0cc981ffa
  • :green_heart: Build #209423 succeeded 45adbe32b78f5e1ffbddc1b19fe955e19d013193
  • :broken_heart: Build #209394 failed 38d3c87f1aa7e6078f3a2a56b5dd318b0de69a24
  • :broken_heart: Build #209360 failed 9674a0fbfc8d7803f3437517cebdb716a0b48ec8
  • :broken_heart: Build #209345 failed 3d8c53bb73375bc2a9d5f89463f73ced5e7e4465

To update your PR or re-run it, just comment with: @elasticmachine merge upstream

cc @cnasikas

kibana-ci avatar May 15 '24 09:05 kibana-ci