graylog2-server

Data Node: Improve error visibility

Open mpfz0r opened this issue 2 years ago • 1 comments

Setting up the data node should help the user to identify errors and give them a chance to fix them.

What?

While setting up a data node, I ran into the common problem of another process already listening on port 9200. The data node logs show: BindHttpException[Failed to bind to 0.0.0.0:9200]; nested: BindException[Address already in use]

However, the UI gives no hint of what is actually happening, or how to recover from that error:

[image]

It would be great if we could show the error logs in the UI, or tell the user how they can retry the deployment.
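For illustration, the port conflict described above can be detected before OpenSearch is started by trying to bind the port first. This is only a minimal sketch in Python, not how Graylog or the data node does it; the function name `port_is_free` is made up for this example, and the only values taken from the report are the bind address 0.0.0.0 and port 9200.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port.

    Hypothetical helper for illustration only. The result is a snapshot:
    another process may still grab the port right after the check.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically EADDRINUSE, the same condition behind
            # "BindException[Address already in use]".
            return False
        return True


if __name__ == "__main__":
    # 9200 is the port from the log message in this report.
    if port_is_free(9200):
        print("Port 9200 is free.")
    else:
        print("Port 9200 is already in use; stop the other process and retry.")
```

A check like this only covers the one failure mode from this report; other startup errors would still need to be read from the logs.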

Your Environment

  • Graylog Version: 5.2.0-beta.3

mpfz0r avatar Oct 18 '23 14:10 mpfz0r

Currently, the errors shown come only from the provisioning process. OpenSearch startup is a completely independent process, and startup errors can also occur during later, regular starts of the DataNode, when they should not be written into the preflight data structures. Also, not every exception in the OpenSearch logs is an error that actually prevents OpenSearch from starting. So finding a solution that goes beyond showing "In case of errors, check the DataNode logs" looks somewhat more complicated.

janheise avatar Oct 26 '23 10:10 janheise

Error visibility was improved in other PRs, and we now show the logs as well.

gally47 avatar May 13 '24 09:05 gally47