Use module name for logger instead of Root Logger

Open michaelshum321 opened this issue 1 year ago • 7 comments

Typically, it's best practice for Python logging to use logging.getLogger(__name__).

However, the ResponseParser simply does import logging and then calls logging.info(...) directly. Module-level calls like this always go through the root logger, as if the logger had been obtained with logging.getLogger() and no name.

For example: https://github.com/aws-samples/amazon-textract-textractor/blob/9df5d268dead3f42104cde2f766cb16be3f93d95/textractor/parsers/response_parser.py#L148
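To illustrate the difference, here is a minimal sketch that captures records in memory and compares the two styles. The ListHandler class and the logger name somelib.parsers.response_parser are made up for the demo and are not part of Textractor.

```python
import logging


class ListHandler(logging.Handler):
    """Collects emitted records in memory so they can be inspected."""

    def __init__(self):
        super().__init__()
        self.records = []

    def emit(self, record):
        self.records.append(record)


handler = ListHandler()
root = logging.getLogger()  # no name: this is the root logger
root.addHandler(handler)
root.setLevel(logging.INFO)

# Module-level call, as described above: the record is created by the root logger.
logging.info("logged via logging.info")

# Named logger, as a library module would get from logging.getLogger(__name__).
# The name is hypothetical and only stands in for a module path.
named = logging.getLogger("somelib.parsers.response_parser")
named.info("logged via a named logger")

names = [record.name for record in handler.records]
print(names)
```

Both records reach the root handler, but only the second one carries a name that a consumer can target with its own level or handlers.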

The ResponseParser emits a large number of INFO logs, which spam our server logs. Because they go through the root logger, the only way we have found to suppress them without affecting everything else is a logging filter:

import logging


class TextractFilter(logging.Filter):
    """
    Since Textract uses the root logger, we cannot set the logger level
    without affecting other usage of the root logger.
    With this filter, we are able to filter INFO logs from Textract.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        # Drop only INFO records that originate from response_parser.py
        return not (record.module == "response_parser" and record.levelno == logging.INFO)


def configure_loggers():
    # This is for Textract to not spam the Server with INFO logs.
    # getLogger() with no name is the root logger on every Python version;
    # getLogger("root") only returns it on Python 3.9+.
    logging.getLogger().addFilter(TextractFilter())


The TextractFilter works, but it is a heavy-handed workaround when all I want to do is something like:

logging.getLogger("textractor").setLevel(logging.WARNING)

Is there a better approach, or could the library switch to a logger created with the module's __name__ so that it is easier to configure? Thank you :)
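For reference, a rough sketch of how this could behave if the parser used a module-named logger. This is an assumption about a possible change, not the library's current code: the logger name textractor.parsers.response_parser is what __name__ would resolve to for the file linked above, and ListHandler and myapp are hypothetical names for the demo.

```python
import logging

records = []


class ListHandler(logging.Handler):
    """Stores (logger name, level, message) tuples for inspection."""

    def emit(self, record):
        records.append((record.name, record.levelname, record.getMessage()))


root = logging.getLogger()
root.addHandler(ListHandler())
root.setLevel(logging.INFO)

# Library side (hypothetical): what logging.getLogger(__name__) would give
# inside textractor/parsers/response_parser.py.
lib_logger = logging.getLogger("textractor.parsers.response_parser")

# Application side: one line silences INFO for the whole package,
# because child loggers inherit the level of their nearest configured ancestor.
logging.getLogger("textractor").setLevel(logging.WARNING)

lib_logger.info("parsing page 1")          # dropped: below WARNING
lib_logger.warning("unexpected block")     # kept
logging.getLogger("myapp").info("request handled")  # kept: other loggers unaffected

print(records)
```

With this setup no filter is needed, and the root logger's level stays untouched for the rest of the application.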

michaelshum321, May 20 '24 21:05