
Metrics not being collected after client host reboot

Open dgiffin opened this issue 2 years ago • 19 comments

Version 7.6

We had a host crash/reboot. PG and the temboard-agent came up fine as expected, but the temboard dashboard has not displayed any metrics since the restart. I can't see any obvious errors in either the temboard-agent or the temboard logs. Any suggestions as to how I can troubleshoot this?

dgiffin avatar Nov 29 '21 11:11 dgiffin

Hi @dgiffin. Do you see anything in the temBoard UI logs?

bersace avatar Nov 29 '21 14:11 bersace

Can you provide an email address to send logs to for review?

bsislow avatar Nov 29 '21 15:11 bsislow

@bsislow yep : etienne DOT bersac AT dalibo DOT com

bersace avatar Nov 29 '21 15:11 bersace

Thanks Etienne, I've emailed the logs.

dgiffin avatar Nov 30 '21 09:11 dgiffin

Any update? Should we reinstall the agent?

bsislow avatar Dec 06 '21 15:12 bsislow

@bsislow yep, I don't see any errors. That's odd and needs further investigation, which I can't do remotely.

bersace avatar Dec 07 '21 07:12 bersace

I've removed the agent and redeployed it... same issue. Please advise, this is urgent...

bsislow avatar Dec 07 '21 14:12 bsislow

Examples of the issue: [three screenshots]

bsislow avatar Dec 07 '21 14:12 bsislow

Do you have errors in the browser's web console?

bersace avatar Dec 07 '21 14:12 bersace

No, none. Do you need a new log sent to your email?

bsislow avatar Dec 07 '21 14:12 bsislow

I am seeing this in the temboard (not agent) logs now. When I removed the agent, I purged it (/usr/share/temboard-agent/purge.sh) and also deleted the instance from the dashboard (Instances).

2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Unhandled Error:
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Traceback (most recent call last):
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 227, in error_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     return func(request, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 408, in user_instance_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     return func(request, address, port, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 267, in instance_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     return callable_(request, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 431, in user_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     return func(request, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/handlers/alerting.py", line 48, in alerts
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     host_id, instance_id = get_request_ids(request)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 93, in get_request_ids
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     host_id = get_host_id(request.db_session, request.instance.hostname)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 75, in get_host_id
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR:     "Could not find registered host with hostname=%s" % hostname
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Exception: Could not find registered host with hostname=************
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Unhandled Error:
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Traceback (most recent call last):
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 227, in error_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     return func(request, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 408, in user_instance_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     return func(request, address, port, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 267, in instance_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     return callable_(request, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 431, in user_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     return func(request, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/handlers/alerting.py", line 91, in checks
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     host_id, instance_id = get_request_ids(request)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 93, in get_request_ids
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     host_id = get_host_id(request.db_session, request.instance.hostname)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:   File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 75, in get_host_id
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR:     "Could not find registered host with hostname=%s" % hostname
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Exception: Could not find registered host with hostname=************
2021-12-07 08:52:28,282 temboardui[18629]: [web] ERROR: Request failed: 500 Could not find registered host with hostname=************.
2021-12-07 08:52:28,282 temboardui[18629]: [web] ERROR: Request failed: 500 Could not find registered host with hostname=************.
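
A long traceback like the one above can be easier to review once it is reduced to its distinct exception messages. The following is a minimal sketch, assuming the log keeps the `timestamp program[pid]: [service] LEVEL: message` layout seen in this excerpt. The function name `summarize_exceptions` and the regular expression are illustrative and are not part of temBoard.

```python
import re
from collections import Counter

# Matches lines shaped like the excerpt above, for example:
# 2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Exception: ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<prog>\w+)\[(?P<pid>\d+)\]: \[(?P<service>\w+)\] "
    r"(?P<level>[A-Z]+): (?P<msg>.*)$"
)

PREFIX = "Exception: "


def summarize_exceptions(lines):
    """Count the distinct 'Exception: ...' messages logged at ERROR level."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if match.group("level") == "ERROR" and msg.startswith(PREFIX):
            counts[msg[len(PREFIX):]] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    "2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Unhandled Error:",
    "2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Exception: "
    "Could not find registered host with hostname=************",
    "2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Exception: "
    "Could not find registered host with hostname=************",
]

for message, count in summarize_exceptions(sample).items():
    print(count, message)
```

To run it against a real log, pass an open file object instead of `sample`; the path of the temBoard UI log depends on the installation.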

bsislow avatar Dec 07 '21 14:12 bsislow

Exception: Could not find registered host with hostname=************

This error means the monitoring plugin in the UI never received data for this host.
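
To illustrate why a missing host produces this error, here is a small sketch. It is not temBoard's actual code: only the name `get_host_id` and the error message come from the traceback, while the dictionary-based store, the `record_metrics` helper, and the hostname are hypothetical. It assumes a host is registered only once monitoring data has arrived for it.

```python
# Hypothetical in-memory stand-in for the monitoring plugin's host table.
registered_hosts = {}


def record_metrics(hosts, hostname):
    """Register the host the first time data arrives (assumed behaviour)."""
    if hostname not in hosts:
        hosts[hostname] = len(hosts) + 1
    return hosts[hostname]


def get_host_id(hosts, hostname):
    """Return the host id, or fail with the message seen in the traceback."""
    try:
        return hosts[hostname]
    except KeyError:
        raise Exception(
            "Could not find registered host with hostname=%s" % hostname
        )


# Before any data has arrived, the lookup fails as in the log.
try:
    get_host_id(registered_hosts, "db1.example.com")
except Exception as exc:
    print(exc)

# Once data has been recorded for the host, the same lookup succeeds.
record_metrics(registered_hosts, "db1.example.com")
print(get_host_id(registered_hosts, "db1.example.com"))
```

Under that assumption, purging and re-adding the instance would leave the lookup failing until the agent's first monitoring data reaches the UI.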

Can you send me the agent's debug log, covering at least 20 minutes of activity? I don't see any monitoring task scheduled in the debug log you sent me.

bersace avatar Dec 07 '21 15:12 bersace

Emailed.

bsislow avatar Dec 07 '21 15:12 bsislow

FYI - this is now happening on another agent host; no data after reboot. I will get back to you shortly with UI logs via email...

bsislow avatar Dec 09 '21 19:12 bsislow

Newest logs sent with DEBUG for UI.

Also, I restarted the temboard UI to change the logging from INFO to DEBUG and that miraculously started populating monitoring charts...

For example (6 hours): [screenshot]

However, although this server has been around for several years, 30 days of history is not available. For example: [screenshot]

All charts look like this for 30 days... (No data available).

bsislow avatar Dec 10 '21 16:12 bsislow

@bsislow, happy to know it's back. Could you keep the logs verbose so we can see what happens if the issue reappears? Can you try to reproduce the root cause of the error, for example by killing an agent or restarting it?

bersace avatar Dec 13 '21 14:12 bersace

I will turn DEBUG back on for the UI.

We will try to reproduce this.

bsislow avatar Dec 13 '21 15:12 bsislow

Hi @bsislow, did you reproduce this with 7.10?

bersace avatar Mar 02 '22 09:03 bersace

We will have to see; we have yet to reboot a host :)

bsislow avatar Mar 02 '22 14:03 bsislow

Closing stale issue. Please reopen if there is any news.

bersace avatar Oct 09 '23 12:10 bersace