Include failure reason in Sidecar Collectors UI, improve failure tracking for large deployments

Open will-graylog opened this issue 3 years ago • 3 comments

Problem description

  • What: Include extra details in Sidecar UI on why Sidecar Collectors are failing. Such extra info is already provided via the /api/sidecars API in the verbose_message response field.
    • Also, having a single view that shows a list of all current collector failures & failure reasons would be highly beneficial. Perhaps something similar to the "System messages" list in the System > Overview page.
  • Where: Graylog Sidecar Collectors UI
  • Why: Seeing why sidecar collectors are failing decreases troubleshooting time and helps customers get sidecars back online faster, especially Graylog end users who only have access to the UI and not the API.
    • Also, a separate list view of all collector failures & failure reasons makes troubleshooting much faster, especially for customers with large sidecar deployments.
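For illustration, here is a minimal sketch of how such a failure list could be assembled from an API response today. Only the `verbose_message` field is confirmed above; the rest of the response shape (`sidecars`, `node_name`, `node_details.status.collectors`, `collector_id`, `message`) and the numeric status code for "failing" are assumptions, and `list_collector_failures` and `SAMPLE` are hypothetical names made up for this example.

```python
from typing import Any, Dict, List

# Assumption: a collector status of 2 means "failing". Not confirmed in this issue.
FAILING_STATUS = 2


def list_collector_failures(payload: Dict[str, Any]) -> List[Dict[str, str]]:
    """Flatten an (assumed) sidecars response into one row per failing collector."""
    rows: List[Dict[str, str]] = []
    for sidecar in payload.get("sidecars") or []:
        # Assumed nesting: node_details -> status -> collectors
        status = (sidecar.get("node_details") or {}).get("status") or {}
        for collector in status.get("collectors") or []:
            if collector.get("status") != FAILING_STATUS:
                continue
            rows.append({
                "sidecar": sidecar.get("node_name", ""),
                "collector_id": collector.get("collector_id", ""),
                "message": collector.get("message", ""),
                "verbose_message": collector.get("verbose_message", ""),
            })
    return rows


# Hypothetical sample payload, shaped according to the assumptions above.
SAMPLE = {
    "sidecars": [
        {
            "node_name": "web-01",
            "node_details": {
                "status": {
                    "collectors": [
                        {"collector_id": "c-1", "status": 0, "message": "Running"},
                        {
                            "collector_id": "c-2",
                            "status": FAILING_STATUS,
                            "message": "Failed to start",
                            "verbose_message": "exit status 1: config file not found",
                        },
                    ]
                }
            },
        },
        {"node_name": "db-01", "node_details": {"status": {"collectors": []}}},
    ]
}

if __name__ == "__main__":
    for row in list_collector_failures(SAMPLE):
        print(f'{row["sidecar"]}\t{row["collector_id"]}\t{row["message"]}\t{row["verbose_message"]}')
```

A view like this only helps users who have API access, which is why the request is to surface the same information in the UI.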

example

Environment

  • Sidecar Version: 1.1.0
  • Graylog Version: 4.2.10
  • Operating System: n/a
  • Elasticsearch Version: n/a
  • MongoDB Version: n/a

ref: [HS-974805117]

will-graylog avatar Jul 27 '22 16:07 will-graylog

@will-graylog if you go to the collector overview and click on the collector name, you can see the details here: [image]

Refs https://github.com/Graylog2/graylog2-server/pull/5208

mpfz0r avatar Aug 16 '22 14:08 mpfz0r

@will-graylog @mikedklein did you see my comment above? Can we close this?

mpfz0r avatar Aug 22 '22 15:08 mpfz0r

@mpfz0r Let me check with my customer and I'll let you know if this is sufficient.

ghost avatar Aug 24 '22 14:08 ghost

Hey @mpfz0r, so sorry about the delay in response. I just heard back from the customer, who says this is insufficient at the scale they are using sidecars (over 3,000 collectors in 1,500+ sidecar agents).

Is there a way we can create a table with all sidecars' statuses, error messages, and verbose debug info if applicable? I agree with our customer that having to click through each sidecar to get to a specific collector's error message is quite tedious with as many collectors as they have.

ghost avatar Dec 05 '22 16:12 ghost

@will-graylog I agree we should improve that. I'll pitch that to product.

mpfz0r avatar Dec 15 '22 09:12 mpfz0r