Expose Cluster Operations Events/Logs

Open vlerenc opened this issue 7 years ago • 7 comments

Story

  • As an operator, I want to access cluster operations logs (create, reconcile, update, delete?) in a convenient way, so that I can easily see the outcome of each operation without having to grep the information out of the overall/global Gardener logs.

Motivation

We would like to improve how we interact with and access the Gardener/cluster logs. As an operator (of the week), I frequently have to check what Gardener logs about a certain cluster and operation. With the reconciler, the logs have grown, and with all the other planned features they will grow even more. Usually, we need to know something specific about a particular cluster, for a particular operation, at a particular date/time (gardener/gardener#49). A central logging stack and UI (e.g. in the Garden cluster) will help me, but we can only expose it to project members if it supports multi-tenancy.

Acceptance Criteria

  • [ ] Show Gardener operations (and their status/results) per cluster (shoot resource event log)
  • [ ] Show link to logging UI (Kibana) that would display all logs for a particular Gardener operation

Implementation Proposal

We could use the Kubify-deployed logging stack: instead of showing the logs for an operation directly in the dashboard, the dashboard could generate a query for that logging stack that shows the right logs in the logging UI. Of course, as long as the logging stack doesn't support multi-tenancy, the feature would only be available to the core team/admins, not to project members. This would be acceptable. The primary goal is to help our operators (of the week), and that solution would fully serve that purpose.
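As a rough illustration of the "generate a query" idea, the sketch below builds a deep link into Kibana's Discover view for one operation on one cluster. It is only a sketch under assumptions: the log field names `shoot` and `operation_id`, the function name, and the example host are hypothetical, and the `_g`/`_a` URL state layout can differ between Kibana versions.

```python
from urllib.parse import quote

# Characters left unescaped so the Rison-style URL state stays readable.
RISON_SAFE = "()':,!"


def build_operation_log_url(base_url, shoot, operation_id, start, end):
    """Return a Discover link pre-filtered to one operation on one shoot.

    `shoot` and `operation_id` are HYPOTHETICAL log field names; the real
    ones depend on how the logging stack indexes Gardener logs.
    """
    query = f'shoot:"{shoot}" AND operation_id:"{operation_id}"'
    # Rison strings are single-quoted; "!" and "'" are escaped with "!".
    rison_query = query.replace("!", "!!").replace("'", "!'")

    # _g carries the time range, _a carries the query itself.
    global_state = f"(time:(from:'{start}',to:'{end}'))"
    app_state = f"(query:(language:lucene,query:'{rison_query}'))"

    encoded_g = quote(global_state, safe=RISON_SAFE)
    encoded_a = quote(app_state, safe=RISON_SAFE)
    return f"{base_url.rstrip('/')}/app/kibana#/discover?_g={encoded_g}&_a={encoded_a}"


if __name__ == "__main__":
    print(
        build_operation_log_url(
            "https://kibana.example.com",  # placeholder host
            "my-shoot",
            "abc123",
            "2018-02-12T17:00:00Z",
            "2018-02-12T19:00:00Z",
        )
    )
```

The dashboard would only have to render this link next to an operation; access control stays with the logging UI, which matches the admin-only restriction described above.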

Definition of Done

  • [ ] Knowledge is distributed: Have you spread your knowledge in pair programming/code review?
  • [ ] Unit Tests are provided: Have you written automated unit tests or added manual NGPTT tickets?
  • [ ] Integration Tests are provided: Have you written automated integration tests?
  • [ ] Minimum API exposure: If you have added public API, was it really necessary/is it minimal?
  • [ ] Operations guide: Have you updated the operations guide?

vlerenc avatar Feb 12 '18 18:02 vlerenc

We will leverage the terminal feature to achieve this and also expose cluster insights like deployments, workers, pods, etc.

grolu avatar Dec 05 '19 08:12 grolu

See also #8 - We want to allow the user to configure custom views to show custom resources, views etc. via the terminal feature on the details screen.

grolu avatar Dec 05 '19 09:12 grolu

Isn't the main problem here how to get to the cluster operation logs in the first place, @grolu (cc @rfranzke)?

vlerenc avatar Aug 13 '20 20:08 vlerenc

Yes. I actually moved all issues of this kind to "in review", because from my point of view the terminal shortcuts feature (https://github.com/gardener/dashboard/pull/739) can be at least a temporary way to solve this. @petersutter

grolu avatar Aug 14 '20 07:08 grolu

Well, it depends on how and from where this information is fetched. @rfranzke is there an example anywhere?

petersutter avatar Aug 18 '20 14:08 petersutter

Can you explain again? What do you want to fetch? What are cluster operation logs?

rfranzke avatar Aug 19 '20 06:08 rfranzke

I think this is a misunderstanding. This is a very old story, or rather an epic, and it is most likely nothing that can be "just implemented". It will need planning and possibly more side changes.

So let's first get back to the goal:

  • Show Gardener operations (and their status/results) per cluster (shoot resource event log)

That can be a creation, upgrade, or reconciliation/maintenance operation. So the first question is whether that's possible, and the second is how. Today, GRM no longer does everything: initiation happens there, but the rest happens in the gardenlet. That aside:

  • Is it useful to present this information to the end users (helpful)? The assumption is yes. End users were also asking for that?
  • Are we sure it doesn't contain sensitive information?
  • If not, what do we have to do in terms of cleanup?
  • Can we find it easily per cluster (by operation ID or something)?
  • How can we extract this data so that we can show it to end users?
  • If we can't, can we show a minimal log, e.g. just an overview of which operations were executed and their result w/o any logs?
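To make the "minimal log" option a bit more concrete, here is a small sketch that reduces Kubernetes events to a per-cluster operation overview without exposing any component logs. It assumes the events have already been fetched as plain dicts in the core/v1 Event shape (`involvedObject`, `reason`, `message`, `type`, `lastTimestamp`); the function name and the sample `reason` values are made up for illustration and are not taken from Gardener.

```python
def summarize_operations(events, shoot_name):
    """Return one overview row per event that targets the given Shoot, oldest first."""
    rows = []
    for event in events:
        involved = event.get("involvedObject", {})
        # Keep only events recorded against this particular Shoot resource.
        if involved.get("kind") != "Shoot" or involved.get("name") != shoot_name:
            continue
        rows.append(
            {
                "time": event.get("lastTimestamp", ""),
                "result": event.get("type", ""),  # "Normal" or "Warning"
                "operation": event.get("reason", ""),
                "message": event.get("message", ""),
            }
        )
    # RFC 3339 timestamps in the same format sort correctly as strings.
    return sorted(rows, key=lambda row: row["time"])


if __name__ == "__main__":
    # Illustrative sample data only; reasons and messages are invented.
    sample_events = [
        {
            "involvedObject": {"kind": "Shoot", "name": "my-shoot"},
            "type": "Warning",
            "reason": "ReconcileFailed",
            "message": "worker pool not ready",
            "lastTimestamp": "2020-09-03T06:20:00Z",
        },
        {
            "involvedObject": {"kind": "Shoot", "name": "my-shoot"},
            "type": "Normal",
            "reason": "CreateSucceeded",
            "message": "cluster created",
            "lastTimestamp": "2020-09-01T10:00:00Z",
        },
        {
            "involvedObject": {"kind": "Shoot", "name": "other-shoot"},
            "type": "Normal",
            "reason": "ReconcileSucceeded",
            "message": "reconciled",
            "lastTimestamp": "2020-09-02T08:00:00Z",
        },
    ]
    for row in summarize_operations(sample_events, "my-shoot"):
        print(row["time"], row["result"], row["operation"], row["message"])
```

Such an overview is only as complete as the retained events, so it would be affected by any event expiry on the API server.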

The log viewer idea (for the detailed logs) in the description is somewhat problematic, because end users should not be able to see logs of shared components like GRM or gardenlet, so is there another way?

As you can see, there are many open questions, because we didn't design this into the solution from the beginning. We did talk about it and did something with events, but then ran into TTL issues. We discussed the matter, but I don't think conclusively, right?

vlerenc avatar Sep 03 '20 06:09 vlerenc