temboard
Metrics not being collected after client host reboot
Version 7.6
We had a host crash/reboot, PG and the temboard-agent came up fine as expected but in the temboard dashboard I can't see any metrics displayed since the restart. I can't see any obvious errors in either the temboard-agent or the temboard logs. Any suggestions as to how I can troubleshoot this?
Hi @dgiffin. Do you see anything in the temBoard UI logs?
Can you provide an email address to send logs to for review?
@bsislow yep : etienne DOT bersac AT dalibo DOT com
Thanks Etienne, I've emailed the logs.
Any update? Should we reinstall the agent?
@bsislow yep, I don't see any errors. That's odd and needs further investigation, which I can't do remotely.
I've removed the agent and redeployed it... same issue. Please advise, this is urgent...
Examples of the issue:
Do you see any errors in the browser's web console?
No, none. Do you need a new log sent to your email?
I am seeing this in the temboard (not agent) logs now. When I removed the agent, I purged it (/usr/share/temboard-agent/purge.sh) and also deleted it (Instances) from the dashboard.
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Unhandled Error:
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Traceback (most recent call last):
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 227, in error_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: return func(request, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 408, in user_instance_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: return func(request, address, port, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 267, in instance_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: return callable_(request, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 431, in user_middleware
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: return func(request, *args)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/handlers/alerting.py", line 48, in alerts
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: host_id, instance_id = get_request_ids(request)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 93, in get_request_ids
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: host_id = get_host_id(request.db_session, request.instance.hostname)
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 75, in get_host_id
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: "Could not find registered host with hostname=%s" % hostname
2021-12-07 08:52:28,278 temboardui[18629]: [web] ERROR: Exception: Could not find registered host with hostname=************
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Unhandled Error:
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Traceback (most recent call last):
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 227, in error_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: return func(request, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 408, in user_instance_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: return func(request, address, port, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 267, in instance_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: return callable_(request, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/web.py", line 431, in user_middleware
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: return func(request, *args)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/handlers/alerting.py", line 91, in checks
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: host_id, instance_id = get_request_ids(request)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 93, in get_request_ids
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: host_id = get_host_id(request.db_session, request.instance.hostname)
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: File "/usr/lib/python2.7/site-packages/temboardui/plugins/monitoring/tools.py", line 75, in get_host_id
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: "Could not find registered host with hostname=%s" % hostname
2021-12-07 08:52:28,281 temboardui[18629]: [web] ERROR: Exception: Could not find registered host with hostname=************
2021-12-07 08:52:28,282 temboardui[18629]: [web] ERROR: Request failed: 500 Could not find registered host with hostname=************.
2021-12-07 08:52:28,282 temboardui[18629]: [web] ERROR: Request failed: 500 Could not find registered host with hostname=************.
Exception: Could not find registered host with hostname=************
This error means the monitoring plugin in the UI never got data for this host.
Can you send me the agent's debug log, covering at least 20 minutes of activity? I don't see any monitoring task scheduled in the debug log you sent me.
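For anyone triaging the same symptom, a minimal sketch of a log scan that counts how often each hostname triggers the "Could not find registered host" exception in the UI log. It assumes the log layout shown in the excerpt above (timestamp, `temboardui[pid]`, `[web] ERROR:`); the function name `missing_hosts` is hypothetical and not part of temBoard.

```python
import re
from collections import Counter

# Matches only the final "Exception:" line of the traceback, so the
# "Request failed: 500 ..." summary lines are not counted a second time.
# The layout is an assumption based on the log excerpt in this thread.
MISSING_HOST = re.compile(
    r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d+ temboardui\[\d+\]: "
    r"\[web\] ERROR: Exception: "
    r"Could not find registered host with hostname=(?P<host>\S+)$"
)


def missing_hosts(lines):
    """Return a Counter mapping hostname -> number of matching exceptions."""
    counts = Counter()
    for line in lines:
        match = MISSING_HOST.match(line.strip())
        if match:
            counts[match.group("host")] += 1
    return counts
```

Passing an open file object for the UI log to `missing_hosts` gives a per-host tally, which may help show whether only the rebooted hosts are affected.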
Emailed.
FYI - this is now happening on another agent host; no data after reboot. I will get back to you shortly with UI logs via email...
Newest logs sent with DEBUG for UI.
Also, I restarted the temboard UI to change the logging from INFO to DEBUG and that miraculously started populating monitoring charts...
For example (6 hours):
However, although this server has been around for several years, 30 days of history is not available. For example:
All charts look like this for 30 days... (No data available).
@bsislow, happy to know it's back. Could you keep the logs verbose so we can see what happens if the issue reappears? Can you try to reproduce the root cause of the error, for example by killing or restarting an agent?
I will turn DEBUG back on for the UI.
We will try to reproduce this.
Hi @bsislow, did you reproduce this with 7.10?
We will have to see; we have yet to reboot a host :)
Closing stale issue. Please reopen if there is any news.