amazon-textract-textractor
Use module name for logger instead of Root Logger
Typically, it's best practice for Python logging to use logging.getLogger(__name__).
However, the ResponseParser simply does import logging and then calls logging.info(...). Because logging.info is a module-level function, this results in the root logger being used, as if the logger were logging.getLogger("root").
For example: https://github.com/aws-samples/amazon-textract-textractor/blob/9df5d268dead3f42104cde2f766cb16be3f93d95/textractor/parsers/response_parser.py#L148
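To illustrate the difference, here is a minimal sketch that records which logger each message comes from. The ListHandler class and the dotted logger name are my own assumptions for the demo (the name mirrors the file path in the link above); this is not code from the library.

```python
import logging

captured = []


class ListHandler(logging.Handler):
    """Collects records in memory so we can inspect which logger emitted them."""

    def emit(self, record: logging.LogRecord) -> None:
        captured.append(record)


root = logging.getLogger()
root.setLevel(logging.INFO)
root.addHandler(ListHandler())

# Assumed name: what __name__ would presumably be inside response_parser.py
named_logger = logging.getLogger("textractor.parsers.response_parser")

logging.info("logged via the module-level function")  # emitted by the root logger
named_logger.info("logged via a named logger")        # propagates up to root's handler

print([record.name for record in captured])
# → ['root', 'textractor.parsers.response_parser']
```

Only the second record carries a name that a consumer could target with setLevel; the first is indistinguishable from any other root logger usage.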
The ResponseParser emits a large number of these logs, and they spam our server logs. As a result, the only way to filter them is to apply a logging filter:
import logging


class TextractFilter(logging.Filter):
    """
    Since Textract uses the root logger, we cannot set the logger level
    without affecting other usage of the root logger.
    With this filter, we are able to filter INFO logs from Textract.
    """

    def filter(self, record: logging.LogRecord) -> bool:
        # Drop only INFO records that originate from response_parser.py
        return not (record.module == "response_parser" and record.levelno == logging.INFO)


def configure_loggers():
    # This is for Textract to not spam the Server with INFO logs.
    # getLogger() with no argument returns the root logger on every Python version.
    logging.getLogger().addFilter(TextractFilter())
The TextractFilter works, but it is generally not best practice when all I want to do is something like:
logging.getLogger("textractor").setLevel(logging.WARNING)
Is there a better approach, or could we change the logger to use the module __name__ so that it is easier to configure? Thank you :)
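For reference, a rough sketch of what the requested change could look like and how it would be used from the application side. The parse_something function is a hypothetical stand-in, and the logger name is spelled out here only because __name__ would be "__main__" in a standalone script; inside the library it would simply be logging.getLogger(__name__).

```python
import logging

# Library side (sketch): a module-level logger instead of logging.info(...).
# Inside the package this line would read: logger = logging.getLogger(__name__)
logger = logging.getLogger("textractor.parsers.response_parser")


def parse_something() -> None:
    """Hypothetical stand-in for a parser function that logs its progress."""
    logger.info("parsing page")
    logger.warning("something looks off")


# Application side: one line quiets the whole package, no filter needed.
logging.getLogger("textractor").setLevel(logging.WARNING)

# The module logger has no level of its own, so it inherits WARNING
# from its ancestor "textractor".
print(logger.isEnabledFor(logging.INFO))     # → False
print(logger.isEnabledFor(logging.WARNING))  # → True
```

With this layout the root logger and every other library keep their own levels, which is what the filter workaround above is trying to approximate.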