nautobot
nautobot copied to clipboard
Jobs Overhaul: Improve Logging Utilities
As ...
Patti - Platform Admin
I want ...
The ability to view all of the logs for a lob, not just the ones that are shown in the logging facility (e.g. self.log_success() )
So that ...
so that I can see full stacktraces and similar issues that I may not otherwise have easy access to.
I know this is done when...
There is a UI and API that provides feedback that is the same as I would see on the console of a worker in debug mode.
Optional - Feature groups this request pertains to.
- [ ] Automation
- [ ] Circuits
- [ ] DCIM
- [ ] IPAM
- [ ] Misc (including Data Sources)
- [ ] Organization
- [ ] Plugins (and other Extensibility)
- [ ] Security (Secrets, etc)
- [ ] Image Management
- [X] UI/UX
- [ ] Documentation
- [ ] Other (not directly a platform feature)
Database Changes
unknown
External Dependencies
No response
### Tasks
- [ ] Ditch bespoke logging levels and align standard logging levels (see: https://docs.python.org/3/library/logging.html#levels, we don't currently support `critical` and `success` does not exist)
- [ ] Use built-in Celery logging utility `get_task_logger` instead of `self.log_success()` as it's just a wrapper around normal Python logging conventions (see: https://docs.celeryq.dev/en/stable/userguide/tasks.html?highlight=get_task_logger#logging)
- [ ] Intercept Celery logging utilities to send to DB (see: https://docs.celeryq.dev/en/stable/userguide/signals.html?highlight=logging#after-setup-task-logger)
Considerations
- We don't need log levels outside the standard Python ones above
- stdout/stderr are redirected by default: https://docs.celeryq.dev/en/stable/userguide/configuration.html#std-setting-worker_redirect_stdouts so may "just work" but making them work should not be considered in-scope for this
- The default behavior redirects to a new
celery.get_logger
instance which does not implement our custom task database handler. Will probably need to use thetask_prerun
signal to implement thesys.stdout/sys.stderr
redirect.
- The default behavior redirects to a new
- Structlog support is a stretch goal and a separate issue
- If we can use the stdlib logger to keep the
log_object
field functionality, structlog support can be pushed back to a 2.x or beyond feature request (maybe use theextra
kwarg in the stdlib logger methods to pass the object)
- If we can use the stdlib logger to keep the
- Controlling emitted log-levels is in scope for #2071
I'd love to get rid of the bespoke logging API for Jobs and just let them use stdlib logging
in all cases.
Seems logical. My only request is a convention to still have the same thing. E.g. If you log your system, something like:
LOGGER.debug("NAUTOBOT_OBJ: <uuid> NAUTOBOT_DEBUG: Serial number data validation failed on the given object.")
You will get the same view that you currently have, in addition to the generic logging
That would be the tradeoff. Unless we want to use something like structlog
- I do not want to go in the direction of having Nautobot be responsible for parsing free-text logging messages to infer structure and semantics.
In the interest of clarity, I am all for using structlog or similar to provide the same functionality, and it becomes an "opt-in" type thing.
Worth considering as part of #765
Need to consider how this intersects with https://github.com/nautobot/nautobot/issues/765#issuecomment-1272388605 - the request there as I understand it would be to also have an option for jobs to not capture logs (or at least, not to store them to the database).
Don't know if it's already covered in some of the related issues, but it would be great to offer, off the shelf, a logging level option, matching the log level messages.
This way, if as a user, I want to get only > WARNING
message, I could select the WARNING
level.
Today, every Job reinvents the wheel for DEBUG
messages, with a boolean input and multiple if
statements.
Supporting the level filtering by default would simplify logging management.
Worth considering: will #1622 be able to address this with standard available tools (see: https://docs.celeryq.dev/en/stable/userguide/tasks.html#task-logging)
Let's discuss a spike after more of #1622 is completed.
- Instead of bespoke logging levels, let's align around the standard logging levels (see: https://docs.python.org/3/library/logging.html#levels, we don't currently support
critical
andsuccess
does not exist) - Using
get_task_logger
instead ofself.log_success
etc to accessing the logging utility- Needs interception to go to database: https://docs.celeryq.dev/en/stable/userguide/signals.html?highlight=logging#after-setup-task-logger
- stdout/stderr are redirected by default: https://docs.celeryq.dev/en/stable/userguide/configuration.html#std-setting-worker_redirect_stdouts so may "just work" but shouldn't be over prioritized
- [ ] Separate issue (stretch goal): Support structlog
- Possibly use Django-Structlog
-
Possibly move to log entries message in DB from being strings to a JSON Serialize
- https://github.com/nautobot/nautobot/blob/develop/nautobot/extras/models/jobs.py#L500-L508 becomes just a single payload (except
created
)
- https://github.com/nautobot/nautobot/blob/develop/nautobot/extras/models/jobs.py#L500-L508 becomes just a single payload (except
Don't know if it's already covered in some of the related issues, but it would be great to offer, off the shelf, a logging level option, matching the log level messages. This way, if as a user, I want to get only
> WARNING
message, I could select theWARNING
level. Today, every Job reinvents the wheel forDEBUG
messages, with a boolean input and multipleif
statements. Supporting the level filtering by default would simplify logging management.
This will be addressed in: https://github.com/nautobot/nautobot/issues/2071
If we don't achieve the stretch goal do we need to make the JobLogEntry.log_object
field a JSONField
for job authors that need to log debug messaging with more complex data structures?
Yes, or more generally we need to solve the problem of the existing non-message
fields on JobLogEntry
and figure out how to accommodate those in the updated logging API we use.
The active_test
job property should go away as part of this issue as well and potentially use the JobResult.status
if that's something that still needs to be tracked in the job's logs.
Closed with #3799
Have spoke with Nautobot core since this was initially closed, but the core of this issue has not addressed as of yet, which is:
There is a UI and API that provides feedback that is the same as I would see on the console of a worker in debug mode.
I understand there has been adjustments made to the logging facility, but the challenge is always the same, we do not know own the logging for non-Nautobot packages, so if Python requests has an error, we do not control that. The only constant I have found is what has been mentioned in the request of the issue, which is the console of a worker logs (at the time it said debug mode as that was a thing in Nautobot 1.x).