nautobot icon indicating copy to clipboard operation
nautobot copied to clipboard

Jobs Overhaul: Improve Logging Utilities

Open itdependsnetworks opened this issue 2 years ago • 14 comments

As ...

Patti - Platform Admin

I want ...

The ability to view all of the logs for a lob, not just the ones that are shown in the logging facility (e.g. self.log_success() )

So that ...

so that I can see full stacktraces and similar issues that I may not otherwise have easy access to.

I know this is done when...

There is a UI and API that provides feedback that is the same as I would see on the console of a worker in debug mode.

Optional - Feature groups this request pertains to.

  • [ ] Automation
  • [ ] Circuits
  • [ ] DCIM
  • [ ] IPAM
  • [ ] Misc (including Data Sources)
  • [ ] Organization
  • [ ] Plugins (and other Extensibility)
  • [ ] Security (Secrets, etc)
  • [ ] Image Management
  • [X] UI/UX
  • [ ] Documentation
  • [ ] Other (not directly a platform feature)

Database Changes

unknown

External Dependencies

No response


### Tasks
- [ ] Ditch bespoke logging levels and align standard logging levels (see: https://docs.python.org/3/library/logging.html#levels, we don't currently support `critical` and `success` does not exist)
- [ ] Use built-in Celery logging utility `get_task_logger` instead of `self.log_success()` as it's just a wrapper around normal Python logging conventions (see: https://docs.celeryq.dev/en/stable/userguide/tasks.html?highlight=get_task_logger#logging)
- [ ] Intercept Celery logging utilities to send to DB (see: https://docs.celeryq.dev/en/stable/userguide/signals.html?highlight=logging#after-setup-task-logger)

Considerations

  • We don't need log levels outside the standard Python ones above
  • stdout/stderr are redirected by default: https://docs.celeryq.dev/en/stable/userguide/configuration.html#std-setting-worker_redirect_stdouts so may "just work" but making them work should not be considered in-scope for this
    • The default behavior redirects to a new celery.get_logger instance which does not implement our custom task database handler. Will probably need to use the task_prerun signal to implement the sys.stdout/sys.stderr redirect.
  • Structlog support is a stretch goal and a separate issue
    • If we can use the stdlib logger to keep the log_object field functionality, structlog support can be pushed back to a 2.x or beyond feature request (maybe use the extra kwarg in the stdlib logger methods to pass the object)
  • Controlling emitted log-levels is in scope for #2071

itdependsnetworks avatar Aug 29 '22 15:08 itdependsnetworks

I'd love to get rid of the bespoke logging API for Jobs and just let them use stdlib logging in all cases.

glennmatthews avatar Aug 29 '22 15:08 glennmatthews

Seems logical. My only request is a convention to still have the same thing. E.g. If you log your system, something like:

LOGGER.debug("NAUTOBOT_OBJ: <uuid>  NAUTOBOT_DEBUG: Serial number data validation failed on the given object.")

You will get the same view that you currently have, in addition to the generic logging

itdependsnetworks avatar Aug 29 '22 15:08 itdependsnetworks

That would be the tradeoff. Unless we want to use something like structlog - I do not want to go in the direction of having Nautobot be responsible for parsing free-text logging messages to infer structure and semantics.

glennmatthews avatar Aug 29 '22 16:08 glennmatthews

In the interest of clarity, I am all for using structlog or similar to provide the same functionality, and it becomes an "opt-in" type thing.

itdependsnetworks avatar Sep 06 '22 13:09 itdependsnetworks

Worth considering as part of #765

glennmatthews avatar Oct 04 '22 12:10 glennmatthews

Need to consider how this intersects with https://github.com/nautobot/nautobot/issues/765#issuecomment-1272388605 - the request there as I understand it would be to also have an option for jobs to not capture logs (or at least, not to store them to the database).

glennmatthews avatar Oct 13 '22 13:10 glennmatthews

Don't know if it's already covered in some of the related issues, but it would be great to offer, off the shelf, a logging level option, matching the log level messages. This way, if as a user, I want to get only > WARNING message, I could select the WARNING level. Today, every Job reinvents the wheel for DEBUG messages, with a boolean input and multiple if statements. Supporting the level filtering by default would simplify logging management.

chadell avatar Nov 17 '22 06:11 chadell

Worth considering: will #1622 be able to address this with standard available tools (see: https://docs.celeryq.dev/en/stable/userguide/tasks.html#task-logging)

Let's discuss a spike after more of #1622 is completed.

bryanculver avatar Jan 23 '23 18:01 bryanculver

  • Instead of bespoke logging levels, let's align around the standard logging levels (see: https://docs.python.org/3/library/logging.html#levels, we don't currently support critical and success does not exist)
  • Using get_task_logger instead of self.log_success etc to accessing the logging utility
    • Needs interception to go to database: https://docs.celeryq.dev/en/stable/userguide/signals.html?highlight=logging#after-setup-task-logger
  • stdout/stderr are redirected by default: https://docs.celeryq.dev/en/stable/userguide/configuration.html#std-setting-worker_redirect_stdouts so may "just work" but shouldn't be over prioritized
  • [ ] Separate issue (stretch goal): Support structlog
    • Possibly use Django-Structlog
    • Possibly move to log entries message in DB from being strings to a JSON Serialize
      • https://github.com/nautobot/nautobot/blob/develop/nautobot/extras/models/jobs.py#L500-L508 becomes just a single payload (except created)

Don't know if it's already covered in some of the related issues, but it would be great to offer, off the shelf, a logging level option, matching the log level messages. This way, if as a user, I want to get only > WARNING message, I could select the WARNING level. Today, every Job reinvents the wheel for DEBUG messages, with a boolean input and multiple if statements. Supporting the level filtering by default would simplify logging management.

This will be addressed in: https://github.com/nautobot/nautobot/issues/2071

bryanculver avatar Apr 04 '23 16:04 bryanculver

If we don't achieve the stretch goal do we need to make the JobLogEntry.log_object field a JSONField for job authors that need to log debug messaging with more complex data structures?

gsnider2195 avatar Apr 05 '23 14:04 gsnider2195

Yes, or more generally we need to solve the problem of the existing non-message fields on JobLogEntry and figure out how to accommodate those in the updated logging API we use.

glennmatthews avatar Apr 05 '23 15:04 glennmatthews

The active_test job property should go away as part of this issue as well and potentially use the JobResult.status if that's something that still needs to be tracked in the job's logs.

gsnider2195 avatar Apr 05 '23 15:04 gsnider2195

Closed with #3799

bryanculver avatar May 30 '23 15:05 bryanculver

Have spoke with Nautobot core since this was initially closed, but the core of this issue has not addressed as of yet, which is:

There is a UI and API that provides feedback that is the same as I would see on the console of a worker in debug mode.

I understand there has been adjustments made to the logging facility, but the challenge is always the same, we do not know own the logging for non-Nautobot packages, so if Python requests has an error, we do not control that. The only constant I have found is what has been mentioned in the request of the issue, which is the console of a worker logs (at the time it said debug mode as that was a thing in Nautobot 1.x).

itdependsnetworks avatar Mar 20 '24 01:03 itdependsnetworks