
Request and Task Diagnostics

Open chessbyte opened this issue 5 years ago • 8 comments

Provide ability for user to click on a Request and see everything that happened to it in the UI.

  • [ ] ManageIQ/manageiq-schema#657
  • [ ] ~~ManageIQ/manageiq-ui-classic#8384~~ https://github.com/ManageIQ/manageiq-ui-classic/pull/8400
  • [ ] ManageIQ/manageiq-automation_engine#503
  • [ ] #22016

chessbyte avatar Apr 21 '21 14:04 chessbyte

My notes:

  • Knowing which requests are attached to which request tasks: maybe we just show all of that directly in the UI (a sort of request graph; it doesn't even have to be fancy).
  • Temp dirs for ansible runner: maybe we have the id in the dir name. They are normally removed, but when they are kept for debugging it's hard to correlate them.
  • (Longer effort) A table for automate logs? Maybe a simple DB logger attached as a broadcast to the automation log, with those records then attached to a request. We originally wanted to wait for #19582, but maybe it's more tactical to have a simple DB table for tracking. When the request goes away, we can remove the entries.
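
A rough sketch of the DB-logger idea in the last bullet, written in Python purely for illustration (ManageIQ itself is Ruby, and none of these names exist in the codebase). It assumes a hypothetical `request_logs` table and a hypothetical `request_id` attribute on each log record; records without one are left to the normal log.

```python
import logging
import sqlite3


class RequestLogHandler(logging.Handler):
    """Copy log records that carry a request_id into a database table.

    The table name and columns are assumptions made for this sketch.
    """

    def __init__(self, conn):
        super().__init__()
        self.conn = conn
        conn.execute(
            "CREATE TABLE IF NOT EXISTS request_logs "
            "(request_id INTEGER, level TEXT, message TEXT)"
        )

    def emit(self, record):
        request_id = getattr(record, "request_id", None)
        if request_id is None:
            return  # not tied to a request, so only the normal log keeps it
        self.conn.execute(
            "INSERT INTO request_logs VALUES (?, ?, ?)",
            (request_id, record.levelname, record.getMessage()),
        )


def purge_request_logs(conn, request_id):
    """Remove the stored entries once the request itself goes away."""
    conn.execute("DELETE FROM request_logs WHERE request_id = ?", (request_id,))


# Attach the handler alongside whatever handlers the log already has.
conn = sqlite3.connect(":memory:")
automation_log = logging.getLogger("automation_sketch")
automation_log.setLevel(logging.INFO)
automation_log.addHandler(RequestLogHandler(conn))

automation_log.info("provision started", extra={"request_id": 42})
automation_log.info("message without a request")  # skipped by the handler

rows = conn.execute(
    "SELECT request_id, level, message FROM request_logs"
).fetchall()
```

Because the handler is added next to the existing ones rather than replacing them, the regular log file would keep receiving every message; the table only holds the subset that can be shown per request.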

Fryguy avatar Apr 21 '21 14:04 Fryguy

List of EmbeddedAnsible issues:

https://github.com/ManageIQ/manageiq/issues?q=is%3Aopen+is%3Aissue+label%3A%22core%2Fembedded+ansible%22

Specifically:

https://github.com/ManageIQ/manageiq/issues/20243

NickLaMuro avatar Apr 21 '21 14:04 NickLaMuro

  • Not having embedded ansible stdout for an automate method is not fun. We have it for direct services, but not via automate; we should collect it for automate as well.

Fryguy avatar Apr 21 '21 14:04 Fryguy

@akhilkr128 can you take a look at this one? I can give you an overview if needed.

Fryguy avatar Oct 26 '21 15:10 Fryguy

@Fryguy is https://github.com/ManageIQ/manageiq/pull/21681 part of this Epic or similar, but different?

chessbyte avatar Feb 23 '22 15:02 chessbyte

I believe it is part of this epic... @akhilkr128 ?

Fryguy avatar Feb 23 '22 18:02 Fryguy

No, that is a different task.

akhilkr128 avatar Mar 28 '22 13:03 akhilkr128

In the past:

  • get request_id from rack into thread local variable (like we do for locale)
  • ensure this request_id is used in the log context.
  • pass request_id into the queue context. (currently userid and stuff like that is part of the context)

Then for the other non-ui tasks:

  • the queue workers read request_id and set the thread local variable. (including automate jobs)
  • the scheduler generates a request_id for use in each scheduled job. (we also used the task name as a "url")
  • automate may need to invent a request_id. unsure.
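
The flow in the two lists above could look roughly like this. It is a minimal Python sketch, not the Ruby/Rack code we actually had: `set_request_id`, `enqueue`, `work`, and the message layout are all hypothetical names invented for the example, and a plain list stands in for the queue.

```python
import io
import logging
import threading
import uuid

_context = threading.local()  # one request_id slot per thread


def set_request_id(request_id=None):
    """Store the given id, or generate one (as a scheduler would)."""
    _context.request_id = request_id or uuid.uuid4().hex
    return _context.request_id


def current_request_id():
    return getattr(_context, "request_id", None)


class RequestIdFilter(logging.Filter):
    """Stamp every record with the emitting thread's request_id."""

    def filter(self, record):
        record.request_id = current_request_id()
        return True


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("[%(request_id)s] %(message)s"))
handler.addFilter(RequestIdFilter())
log = logging.getLogger("request_id_sketch")
log.setLevel(logging.INFO)
log.addHandler(handler)
log.propagate = False


def enqueue(queue, payload):
    """Carry the request_id in the queue message context."""
    queue.append({"request_id": current_request_id(), "payload": payload})


def work(queue, results):
    """Worker side: restore the request_id before doing the job."""
    message = queue.pop(0)
    set_request_id(message["request_id"])
    log.info("processing %s", message["payload"])
    results.append((current_request_id(), message["payload"]))


# "Web" thread: the id would normally come from the incoming request.
queue, results = [], []
set_request_id("req-123")
log.info("queued refresh")
enqueue(queue, "refresh")

worker = threading.Thread(target=work, args=(queue, results))
worker.start()
worker.join()
output = stream.getvalue()
```

Both log lines end up tagged with `req-123` even though they were written from different threads, which is the property that makes the later lookup possible.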

I know the word request_id may seem like an httpd request concept, but we visualized it as a generic request/ask, whether coming from a cron system or from a browser.

To look up an issue:

  • use the request parameters or controller name to find an entry in the logs that matches the issue.
  • Extract the request_id from the search result
  • Use request_id to follow the issue across all systems.
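
As a toy illustration of those three steps, assuming the log entries have already been parsed into dictionaries (the field names and sample entries are made up for this sketch):

```python
def find_request_ids(entries, **criteria):
    """Steps 1 and 2: match entries on known fields, collect their request_ids."""
    return sorted(
        {
            entry["request_id"]
            for entry in entries
            if all(entry.get(key) == value for key, value in criteria.items())
        }
    )


def trace(entries, request_id):
    """Step 3: every entry for that request_id, across all systems."""
    return [entry for entry in entries if entry["request_id"] == request_id]


# Hypothetical, already-indexed log entries from several systems.
entries = [
    {"request_id": "req-1", "system": "ui", "controller": "vm",
     "message": "POST /vm/provision"},
    {"request_id": "req-2", "system": "ui", "controller": "host",
     "message": "GET /host/show"},
    {"request_id": "req-1", "system": "queue",
     "message": "deliver provision task"},
    {"request_id": "req-1", "system": "automate",
     "message": "error: quota exceeded"},
]

ids = find_request_ids(entries, controller="vm")
timeline = [(e["system"], e["message"]) for e in trace(entries, ids[0])]
```

With a real log index the two functions would be two queries: one on request parameters or controller name, and one on the request_id it returns.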

We stored our logs in Mongo/Elasticsearch and were able to search the HTTP request parameters, URL, errors, and the request_id. There are probably much better log indexing systems these days.

kbrock avatar Mar 29 '22 18:03 kbrock

Closing as all the dependent PRs are merged.

jrafanie avatar Feb 14 '23 15:02 jrafanie