
[Question] Is there a way to show logs on screen on windows for multiprocessing?

Open monchin opened this issue 1 year ago • 7 comments

Hi, I have read the documentation and learned that on Windows, if I want to use multiprocessing, I need

logger.remove()
logger.add("file.log", enqueue=True)

But this way, I need to keep opening file.log manually to see what has happened, which is a little inconvenient. So I wonder: is there any way to show logs on screen on Windows, even if I use multiprocessing?

monchin avatar Jan 08 '24 06:01 monchin

Have you tried logging to stderr instead of a file?

import sys
...
logger.remove()
logger.add(sys.stderr, enqueue=True)

bojobo avatar Jan 10 '24 09:01 bojobo

I haven't, because I saw in the documentation that Default "sys.stderr" sink is not picklable. I took this to mean that when using multiprocessing on Windows, we can't use loguru with sys.stderr.

monchin avatar Jan 11 '24 08:01 monchin

@monchin Both the sys.stderr and "file.log" handlers aren't picklable by default. To use them properly with multiprocessing, they need to be configured with enqueue=True. With enqueue=True, they become picklable.

The documentation refers to the default sys.stderr sink pre-configured when importing Loguru. This one was added with enqueue=False and therefore can't be pickled. However, as @bojobo suggested, using logger.add(sys.stderr, enqueue=True) is perfectly fine and should resolve your problem.

I think the documentation is inaccurate: instead of Default "sys.stderr" sink is not picklable, it should say Default "sys.stderr" handler is not picklable.
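
For illustration, a minimal sketch of the full pattern on Windows (following the multiprocessing recipe from the documentation; the worker and message names are just examples):

import sys
import multiprocessing as mp
from loguru import logger

def worker(logger_):
    # The pickled logger carries the enqueue=True stderr handler with it.
    logger_.info("Hello from the child process")

if __name__ == "__main__":
    logger.remove()  # drop the default, non-picklable stderr handler
    logger.add(sys.stderr, enqueue=True)  # picklable replacement
    p = mp.Process(target=worker, args=(logger,))
    p.start()
    logger.info("Hello from the main process")
    p.join()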

Delgan avatar Jan 12 '24 16:01 Delgan

Thank you for your reply! For now I have given up on using loguru, because on Windows it seems I would need to pass a logger_ argument to every function that logs something, and when the call chain is deep, this becomes inconvenient.

Before I knew about loguru, and now after giving it up, I have been using logging.handlers.QueueHandler. In each process, I need to initialize the logger once, like:

import logging
import logging.handlers
import multiprocessing as mp

def init_qh_logger(q: mp.Queue, log_lv=logging.DEBUG):
    # Attach a QueueHandler to the root logger so every record
    # produced in this process is pushed onto the shared queue.
    logger = logging.getLogger()
    logger.setLevel(log_lv)
    qh = logging.handlers.QueueHandler(q)
    logger.addHandler(qh)

def new_proc(q: mp.Queue):
    # Entry point of the child process: initialize once, then log freely.
    init_qh_logger(q)
    logger = logging.getLogger()
    logger.info("some info here")

    some_other_func_in_same_proc()

def some_other_func_in_same_proc():
    # No logger argument needed: the root logger is already configured.
    logger = logging.getLogger()
    logger.warning("warning here")

This way, I only need to initialize the logger once per process; even with a long call chain, I don't need to add a logger_ argument to each function. Then I just need to handle the log messages from the queue in a dedicated process or thread at the top of the program:

def logging_consumer(q: mp.Queue):
    # Drain the queue and re-dispatch each record through the locally
    # configured handlers; a None sentinel stops the loop.
    while True:
        record = q.get()
        if record is None:
            break
        logger = logging.getLogger(record.name)
        logger.handle(record)
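
Wired together, the whole thing might look like this (a minimal sketch; only child processes attach a QueueHandler, so the consumer can safely re-dispatch records to the real handlers configured in the main process):

import threading

if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)  # real handlers live in the main process only
    q = mp.Queue()

    # Consume child records in a background thread of the main process.
    consumer = threading.Thread(target=logging_consumer, args=(q,))
    consumer.start()

    p = mp.Process(target=new_proc, args=(q,))
    p.start()
    logging.getLogger().info("main proc here")  # the main process logs directly
    p.join()

    q.put(None)  # sentinel: stop the consumer
    consumer.join()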

I'm not sure whether it would be a good idea to make this approach an option in loguru, like:

import multiprocessing as mp
from loguru import logger

def new_proc():
    logger.enqueue()  # means: in this process, logging goes through a QueueHandler
    logger.info("info here")
    function()

def function():
    logger.warning("warning here")

if __name__ == "__main__":
    logger.use_multiprocessing()
    logger.add("file.log")
    logger.enqueue()
    p = mp.Process(target=new_proc)
    p.start()
    logger.info("main proc here")
    p.join()

What I mean is: as soon as logger.use_multiprocessing() is executed, loguru would automatically spawn a process or thread that takes log records from an internal mp.Queue (an external one might also be OK). All logger.add() sinks would be attached in that process or thread, and users would just need to call logger.enqueue() at the beginning of each process, so enqueue=True would no longer be needed. This way, using multiprocessing on Windows might be easier, and code would be more uniform across operating systems. The interface and implementation could differ, but what do you think about this idea?

monchin avatar Jan 15 '24 05:01 monchin

@monchin Thanks for the feedback. I agree with you that Loguru isn't very convenient to use with multiprocessing and you're not the first one to express this.

You've got a good idea, and in fact I've been planning and mentioning something similar here: https://github.com/Delgan/loguru/issues/818#issuecomment-1538688196

I'm envisioning two new APIs. The first one is logger.reinstall() and would need to be called once in the child process, but would update the logger "globally". This eliminates the need to pass the logger to each function.

import multiprocessing as mp
from loguru import logger

def new_proc(logger):
    logger.reinstall()
    logger.info("some info here")
    some_other_func_in_same_proc()

def some_other_func_in_same_proc():
    logger.warning("warning here")

if __name__ == "__main__":
    logger.add("file.log", enqueue=True)
    p = mp.Process(target=new_proc, args=(logger,))
    p.start()
    logger.info("main proc here")
    p.join()

Passing the logger to new_proc is still required, though. This is because it contains a Queue internally, and there is no way for the child process to access it otherwise; the Queue is in charge of synchronizing the messages.

The second method would be something like logger.interconnect(), which installs a server/client allowing child processes to communicate with the main one.

import multiprocessing as mp
from loguru import logger

if __name__ == "__main__":  # Main process
    logger.interconnect(serve=True)
else:  # Child process
    logger.interconnect(server="127.0.0.1")

def new_proc():
    logger.info("some info here")
    some_other_func_in_same_proc()

def some_other_func_in_same_proc():
    logger.warning("warning here")

if __name__ == "__main__":
    logger.add("file.log")
    p = mp.Process(target=new_proc)
    p.start()
    logger.info("main proc here")
    p.join()

In this second approach, a TCP socket is used internally, and there is no need to pass the Queue around.
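
Something similar can already be approximated with the standard library, since a logging.Handler is a valid Loguru sink (a rough sketch of the idea, not the proposed API; the port and class names are illustrative):

import logging
import logging.handlers
import pickle
import socketserver
import struct
import threading

PORT = 9999  # illustrative fixed port

class LogRecordReceiver(socketserver.StreamRequestHandler):
    # Receives the length-prefixed pickled LogRecords produced by
    # logging.handlers.SocketHandler and re-dispatches them locally.
    def handle(self):
        while True:
            header = self.rfile.read(4)
            if len(header) < 4:
                break
            size = struct.unpack(">L", header)[0]
            data = self.rfile.read(size)
            record = logging.makeLogRecord(pickle.loads(data))
            logging.getLogger(record.name).handle(record)

def start_log_server():
    # Run the receiver in a daemon thread of the main process.
    server = socketserver.ThreadingTCPServer(("127.0.0.1", PORT), LogRecordReceiver)
    threading.Thread(target=server.serve_forever, daemon=True).start()

# In a child process, everything can then be forwarded to the main one:
#     from loguru import logger
#     logger.add(logging.handlers.SocketHandler("127.0.0.1", PORT))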

Delgan avatar Jan 15 '24 20:01 Delgan

@Delgan I tried to implement your first idea. It is very simple, just:

# _logger.py
import os

class Logger:
    def __init__(self, core, *args):
        self._core = core
        self._options = tuple(args)
        self._own_pid = os.getpid()  # PID of the process that created this logger

    def _replace_core(self, core: Core):
        self._core = core

    def reinstall(self):
        if self._own_pid == os.getpid():  # same process, nothing to do
            return
        from loguru import logger

        logger._replace_core(self._core)

And I tried it out in tests/test_multiprocessing.py:

# tests/test_multiprocessing.py
def subworker_spawn(logger_):
    logger_.reinstall()
    logger.info("Child")
    deeper_subworker()

def deeper_subworker():
    logger.info("Grandchild")


def test_process_spawn(spawn_context):
    writer = Writer()

    logger.add(writer, context=spawn_context, format="{message}", enqueue=True, catch=False)

    process = spawn_context.Process(target=subworker_spawn, args=(logger,))
    process.start()
    process.join()

    assert process.exitcode == 0

    logger.info("Main")
    logger.remove()

    assert writer.read() == "Child\nGrandchild\nMain\n"

It seems to work fine, but I'm not sure whether it is correct.

For your second idea, as far as I know you must specify not only an IP address but also a port number, so if several Python programs using loguru run simultaneously, each must use a different port. It may be hard to decide the port number automatically, or you would have to let users specify it at the very beginning of the program.
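
One common workaround is to bind to port 0 so the OS picks a free port, then share the chosen port with the children (a minimal sketch; the environment variable name is hypothetical):

import os
import socket

# Bind to port 0 and let the OS assign any free port.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.bind(("127.0.0.1", 0))
port = sock.getsockname()[1]

# Environment variables are inherited by spawned children, so this is
# one way to share the chosen port ("LOGURU_PORT" is a hypothetical name).
os.environ["LOGURU_PORT"] = str(port)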

monchin avatar Jan 20 '24 12:01 monchin

It seems to work fine, but I'm not sure whether it is correct.

At first glance, I would say it's fine.

For your second idea, as far as I know you must specify not only an IP address but also a port number, so if several Python programs using loguru run simultaneously, each must use a different port. It may be hard to decide the port number automatically, or you would have to let users specify it at the very beginning of the program.

You're right. I plan to set a configurable default port. I don't think we can do otherwise.

Delgan avatar Jan 20 '24 13:01 Delgan