Request and Task Diagnostics
Provide the ability for a user to click on a Request in the UI and see everything that happened to it.
- [ ] ManageIQ/manageiq-schema#657
- [ ] ~~ManageIQ/manageiq-ui-classic#8384~~ https://github.com/ManageIQ/manageiq-ui-classic/pull/8400
- [ ] ManageIQ/manageiq-automation_engine#503
- [ ] #22016
My notes:
- knowing which requests are attached to which request tasks: maybe we just show all of that in the UI directly (a sort of request graph; it doesn't even have to be fancy)
- temp dirs for ansible runner: maybe we put the id in the dir name. They are normally removed, but when they are kept for debugging it's hard to correlate them
- (longer effort) a table for automate logs? Maybe a simple db-logger attached as a broadcast to the automation log, with those records attached to a request. We originally wanted to wait for #19582, but it may be more tactical to have a simple DB table for tracking. When the request goes away, we can remove the entries.
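
A minimal sketch of the db-logger idea from the last note, assuming an in-memory hash stands in for the DB table. `RequestLogStore` and `BroadcastingAutomationLog` are hypothetical names for illustration, not existing ManageIQ classes:

```ruby
require "logger"
require "stringio"

# Stand-in for a DB table (hypothetical): request_id => array of log rows.
class RequestLogStore
  def initialize
    @rows = Hash.new { |hash, key| hash[key] = [] }
  end

  def add(request_id, severity, message)
    @rows[request_id] << { severity: severity, message: message }
  end

  def for_request(request_id)
    @rows.fetch(request_id, [])
  end

  # Called when the request goes away, so its log rows go with it.
  def purge(request_id)
    @rows.delete(request_id)
  end
end

# Writes every message to the regular logger and also records it
# against the request it belongs to.
class BroadcastingAutomationLog
  def initialize(file_logger, store)
    @file_logger = file_logger
    @store = store
  end

  %i[debug info warn error].each do |level|
    define_method(level) do |request_id, message|
      @file_logger.public_send(level, message)
      @store.add(request_id, level.to_s.upcase, message)
    end
  end
end
```

With this shape the UI could list `store.for_request(id)` next to the request, and the cleanup is a single `purge` call when the request is deleted.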
List of EmbeddedAnsible issues:
https://github.com/ManageIQ/manageiq/issues?q=is%3Aopen+is%3Aissue+label%3A%22core%2Fembedded+ansible%22
Specifically:
https://github.com/ManageIQ/manageiq/issues/20243
- not having embedded ansible stdout on an automate method is not fun. We have it for direct services, but not via automate. We should collect that for automate as well.
@akhilkr128 can you take a look at this one? I can give you an overview if needed.
@Fryguy is https://github.com/ManageIQ/manageiq/pull/21681 part of this Epic or similar, but different?
I believe it is part of this epic... @akhilkr128 ?
No, that is a different task.
In the past:
- get request_id from rack into thread local variable (like we do for locale)
- ensure this request_id is used in the log context.
- pass request_id into the queue context. (currently userid and stuff like that is part of the context)
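
The three steps above can be sketched roughly as follows. This is an illustration under assumptions, not the code we ran: `RequestContext`, `RequestIdMiddleware`, `build_logger`, `queue_context`, and the `HTTP_X_REQUEST_ID` header key are all hypothetical choices, and the middleware relies only on the Rack convention that an app is any object responding to `call(env)`.

```ruby
require "logger"
require "securerandom"
require "stringio"

# Hypothetical holder for the current request_id.
# Note: Thread.current[] is fiber-local in Ruby.
module RequestContext
  KEY = :request_id

  def self.request_id
    Thread.current[KEY]
  end

  # Sets the id for the duration of the block, then restores the old value.
  def self.with_request_id(id)
    previous = Thread.current[KEY]
    Thread.current[KEY] = id
    yield
  ensure
    Thread.current[KEY] = previous
  end
end

# Rack-style middleware: takes the id from the env or generates one.
class RequestIdMiddleware
  def initialize(app)
    @app = app
  end

  def call(env)
    id = env["HTTP_X_REQUEST_ID"] || SecureRandom.uuid
    RequestContext.with_request_id(id) { @app.call(env) }
  end
end

# Logger whose formatter prefixes each line with the current request_id.
def build_logger(io)
  logger = Logger.new(io)
  logger.formatter = proc do |severity, _time, _progname, msg|
    "[#{RequestContext.request_id || '-'}] #{severity}: #{msg}\n"
  end
  logger
end

# Hypothetical queue context: request_id travels next to userid.
def queue_context(userid)
  { userid: userid, request_id: RequestContext.request_id }
end
```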
Then for the other non-ui tasks:
- the queue workers read request_id and set the thread local variable. (including automate jobs)
- the scheduler generates a request_id for use in each scheduled job. (we also used the task name as a "url")
- automate may need to invent a request_id. unsure.
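
A rough sketch of the worker and scheduler side, assuming a plain array stands in for the queue. `QueueMessage`, `CurrentRequest`, `enqueue`, `schedule`, `work_off`, and the `sched-` id format are hypothetical names made up for this example:

```ruby
require "securerandom"

# Hypothetical in-memory queue message; a real queue row would persist these fields.
QueueMessage = Struct.new(:method_name, :args, :request_id, keyword_init: true)

# Hypothetical accessor for the id of the request being processed.
module CurrentRequest
  def self.id
    Thread.current[:request_id]
  end

  def self.id=(value)
    Thread.current[:request_id] = value
  end
end

# Producer side: copy the caller's request_id into the message.
def enqueue(queue, method_name, *args)
  queue << QueueMessage.new(method_name: method_name, args: args, request_id: CurrentRequest.id)
end

# Scheduler side: there is no inbound request, so generate an id per scheduled job.
# The task name plays the role of the "url".
def schedule(queue, task_name)
  id = "sched-#{task_name}-#{SecureRandom.hex(4)}"
  queue << QueueMessage.new(method_name: task_name, args: [], request_id: id)
end

# Worker side: restore the id before running each job and clear it afterwards.
def work_off(queue)
  results = []
  until queue.empty?
    message = queue.shift
    CurrentRequest.id = message.request_id
    begin
      results << yield(message)
    ensure
      CurrentRequest.id = nil
    end
  end
  results
end
```

Clearing the id in `ensure` matters for long-lived workers: without it, a job that arrives with no id would inherit the previous job's id and the logs would mislead.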
I know the word request_id may seem like an httpd request concept, but we visualized it as a generic request/ask, whether coming from a cron system or from a browser.
To look up an issue:
- use the request parameters or controller name to find an entry in the logs that matches the issue.
- Extract the request_id from the search result
- Use request_id to follow the issue across all systems.
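
As a small illustration of that lookup, here is a sketch over plain arrays of log lines. The `[<request_id>] <SEVERITY>: <message>` line format and the helper names are assumptions for the example; an indexed log store would do the same thing with queries:

```ruby
# Hypothetical log line format: "[<request_id>] <SEVERITY>: <message>"
LINE_PATTERN = /\A\[(?<request_id>[^\]]+)\] (?<severity>\w+): (?<message>.*)\z/

# Returns a hash for a well-formed line, nil otherwise.
def parse(line)
  match = LINE_PATTERN.match(line.chomp)
  match && { request_id: match[:request_id], severity: match[:severity], message: match[:message] }
end

# Steps 1 and 2: find the first entry matching the symptom and take its request_id.
def find_request_id(lines, needle)
  entry = lines.filter_map { |line| parse(line) }.find { |e| e[:message].include?(needle) }
  entry && entry[:request_id]
end

# Step 3: collect every entry with that request_id across all log sources.
# `sources` maps a source name to its array of log lines.
def trace(sources, request_id)
  sources.flat_map do |name, lines|
    lines.filter_map { |line| parse(line) }
         .select { |e| e[:request_id] == request_id }
         .map { |e| e.merge(source: name) }
  end
end
```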
We stored our logs in Mongo/Elasticsearch and were able to search by the HTTP request parameters, URL, errors, and the request_id. There are probably much better log indexing systems these days.
Closing as all the dependent PRs are merged.