blog
blog copied to clipboard
How to handle multi process logging in Python?
Python's logging
module provides a list of super useful handlers to handle/redirect log messages to required target destinations. For instance FileHandler
sends the messages to a file, DatagramHandler
sends to UDP ports, etc. In a multi process setup however, using FileHandler
to redirect logs to the same file from different processes would only corrupt the log file. Explicitly acquiring locks to the same file is a bad thing to do.
An out-of-the-box way is to use this package called multiprocess-logging
. This pip package implements a custom log handler. The handler receives all the messages, puts them in an internal queue and emits by dequeuing messages from the queue.
Aggregating the log messages to a queue is the only way to handle this scenario. But there is a graceful alternative way to achieve this. Python's logging module provides built in mechanisms to handle queue based logging with QueueHandler
and QueueListener
classes.
Similar to how FileHandler
provides a way to do file based logging, QueueHandler
helps to log to a queue object. This queue can be from the queue
module or multiprocessing.Queue
.
import logging
from logging.handlers import QueueHandler
from queue import SimpleQueue
logging.warning("Test this warning in the console")
q = SimpleQueue()
queue_handler = QueueHandler(q)
root_logger = logging.getLogger()
# remove the StreamHandler instance that is set by default
root_logger.handlers.pop()
# add queue handler
# this will stop the log messages from printing in the console
root_logger.addHandler(queue_handler)
logging.warning("Test this warning in the queue")
print(q.get())
This prints:
WARNING:root:Test this warning in the console
<LogRecord: root, 30, test.py, 14, "Test this warning in the queue">
Note the second printed line is only because of the print
function. The second logging.warning
function has consumed the message and put it in the q
object.
QueueListener
is used along with the QueueHandler
to collect messages from the queue and push them to various other targets such as a file, UDP port, etc.
...
file_handler = FileHandler("out.log")
listener = QueueListener(q, file_handler)
listener.start()
...
The listener
looks for messages in q
and pushes them to file_handler
which writes to out.log
. Let's use the same flow in a multi process setup.
Assume a master process that spawns 4 children and all log messages are to be written to out.log
file. Any logger message from the children will be written to a multiprocessing.Queue
. The children need to initialize a QueueHandler
and attach it to their root logger as seen above to achieve this. The same queue is listened at the master process's QueueListener
.
Here's how the master process's logging is setup:
def setup_primary_logging():
log_queue = mp.Queue(-1)
# Handlers for stream/file logging
output_file_log_handler = logging.FileHandler(filename="out.log")
formatter = logging.Formatter('%(asctime)s - %(levelname)s - %(message)s')
output_file_log_handler.setFormatter(formatter)
output_file_log_handler.setLevel(logging.INFO)
# This listener listens to the `log_queue` and pushes the messages to the list of
# handlers specified.
listener = QueueListener(log_queue, output_file_log_handler, error_file_log_handler, respect_handler_level=True)
listener.start()
return log_queue
And the worker's logging setup:
def setup_worker_logging(log_queue):
queue_handler = QueueHandler(log_queue)
queue_handler.setLevel(logging.INFO)
root_logger = logging.getLogger()
root_logger.addHandler(queue_handler)
# Default logger level is WARNING, hence the change. Otherwise, any worker logs
# are not going to get bubbled up to the parent's logger handlers from where the
# actual logs are written to the output
root_logger.setLevel(logging.INFO)
Note that log_queue
object has to be sent from the master to all the children. So the master process looks something like this:
def worker_fn(log_queue):
setup_worker_logging(log_queue)
def master_fn():
log_queue = setup_primary_logging()
mp.spawn(worker_fn, args=(log_queue,), n_procs=4)
Any messages from the worker will be logged to out.log
file with the above setup. A complete working example can be seen in this gist.