loguru icon indicating copy to clipboard operation
loguru copied to clipboard

add extra to default text output format

Open trim21 opened this issue 1 year ago • 13 comments

This is a suggestion to change default loguru text output format, or a new option to append extra to non-serialized text format.

There are many structured logging will include extra to text output, but loguru just ignore them.

Hope loguru could support this. For example, append key1=value1 key2=value2 to default text output.

so a code snippet will output 2024-09-12 00:17:01.261 | INFO | __main__:<module>:5 - hello world task=main pid=1

from loguru import logger


with logger.contextualize(task="main", pid=1):
    logger.info("hello world")

golang stdlib log/slog:

2022/11/08 15:28:26 INFO hello count=3

zerolog(golang) image

pino(nodejs) image

structlog(python) image

trim21 avatar Sep 11 '24 16:09 trim21

Thanks for the comparison with different frameworks. That's very helpful.

I'd already thought about it and I'm all for it. Structured logging is the way to go.

My plan is to add a new entry to the log record, for example a data dict. This would be filled by the keyword arguments of the logged message instead of extra currently. This is because it should be viewed as semantically different.

Then, add {data} to the current default format, which would be nicely formatted (if non-empty), e.g.:

logger.info("Hello world", foo="bar", baz=123)
# => 2024-09-22 19:39:58.381 | INFO     | __main__:<module>:1 - Hello world (foo="bar", baz=123)

Delgan avatar Sep 22 '24 17:09 Delgan

will kwargs from logger.contextualize be included?

trim21 avatar Sep 22 '24 17:09 trim21

Initially, I believed it should not, but maybe I'm wrong. I haven't given it enough thought. I need to look into the current usage made of contextualize() in relation to bind().

Delgan avatar Sep 22 '24 17:09 Delgan

that should be really useful.

but for example :

logger.opt().bind(a=1).info("good {}, nice {b}", "morning", b="job", c="come on")

may be logout:
# => 2024-09-22 19:39:58.381 | INFO     | __main__:<module>:1 - good morning, nice job (c="come on")

should "a=1" included?

and I also read loguru source code, it can not clearly distinguish between 'bind' and 'format-placeholder', maybe we should add some new func with propagation to handler this.

boring but works like this

logger.opt().with(a=1, b="2").with(c=3).info("xxx")
# => - xxx (a=1, b="2", c=3)

elvizlai avatar Dec 11 '24 02:12 elvizlai

that should be really useful.

but for example :

logger.opt().bind(a=1).info("good {}, nice {b}", "morning", b="job", c="come on")

may be logout:
# => 2024-09-22 19:39:58.381 | INFO     | __main__:<module>:1 - good morning, nice job (c="come on")

should "a=1" included?

and I also read loguru source code, it can not clearly distinguish between 'bind' and 'format-placeholder', maybe we should add some new func with propagation to handler this.

boring but works like this

logger.opt().with(a=1, b="2").with(c=3).info("xxx")
# => - xxx (a=1, b="2", c=3)

I think it should

trim21 avatar Dec 19 '24 23:12 trim21

@elvizlai Regarding the following example:

logger.bind(a=1).info("good {}, nice {b}", "morning", b="job", c="come on")

I think that the next version of Loguru must clearly distinguish between :

  1. Parameters intended for formatting ("morning" here).
  2. Parameters intended for contextual / structured logging (c="come on" here).
  3. Parameters intended to identify logger instances and specialize handlers (a=1 here).

Currently there is some undesirable overlap: keyword arguments end up both in the extra record dict and are also used for formatting. They have the same behavior as bind() and positional arguments. This design has led to a lot of problems and complications, and I think it's imperative that we rethink it.

I was planning to do is the following:

  1. Positional arguments will be reserved for formatting only, as it is currently.
  2. Keyword arguments will no longer be used for formatting. Instead, they will fill a new data entry in the record dict. The default format will also contains a new {data} field that will automatically format the values as discussed in this ticket.
  3. The bind() method will stay as is, but will be the only way to populate the extra entry in the record dict.

In such scenario, that means the bind() values and keyword arguments are not mixed, and therefore, the bind() values would not appear by default in the formatted {data}.

This is because, I believe keyword arguments to be contextual to the log line, and I don't want this to inadvertently override a value configured by bind(). For example:

from loguru import logger

# Imagine a filter or sink implements a specialization for this named logger.
named_logger = logger.bind(name="my-specific-logger")

named_logger.info("App initialized", count=123)
named_logger.info("New user connected", name=username)  # Oops!

I believe there should be a conceptual difference between bind() usage and keyword arguments. Do you see it otherwise @trim21?

Delgan avatar Dec 30 '24 18:12 Delgan

missing logger.contextualize

trim21 avatar Dec 31 '24 00:12 trim21

missing logger.contextualize

Yes, I'm not sure yet if it falls more into the category of bind() or contextual keyword arguments. I remember we shortly discussed this before.

Delgan avatar Dec 31 '24 13:12 Delgan

I vote for .bind() because it's implicit like .bind()

trim21 avatar Dec 31 '24 13:12 trim21

Yes, it's also what seems most natural to me.

Delgan avatar Dec 31 '24 13:12 Delgan

Hello hello!

Just my 5 cents on contextualize: it would be really nice to see bound variable in normal (non-structured) messages too!

Often, I use structlog for this:

import structlog
from structlog.contextvars import (
    bound_contextvars,
)

structlog_logger = structlog.get_logger()
for item_id in range(5):
    with bound_contextvars(item_id=item_id):
        structlog_logger.info("started processing item")

        if item_id % 3 == 0:
            structlog_logger.info("skip: bad item id")
            continue

        structlog_logger.info("successfully processed item")

It is very convenient when dealing with deeply nested loops / conditions.

Output:

2025-01-16 10:42:09 [info     ] started processing item        item_id=0
2025-01-16 10:42:09 [info     ] skip: bad item id              item_id=0
2025-01-16 10:42:09 [info     ] started processing item        item_id=1
2025-01-16 10:42:09 [info     ] successfully processed item    item_id=1
2025-01-16 10:42:09 [info     ] started processing item        item_id=2
2025-01-16 10:42:09 [info     ] successfully processed item    item_id=2
2025-01-16 10:42:09 [info     ] started processing item        item_id=3
2025-01-16 10:42:09 [info     ] skip: bad item id              item_id=3
2025-01-16 10:42:09 [info     ] started processing item        item_id=4
2025-01-16 10:42:09 [info     ] successfully processed item    item_id=4

But I really like all the convenience features of loguru, and it would be really nice to have similar behavior here too:

from loguru import logger as loguru_logger

for item_id in range(5):
    with loguru_logger.contextualize(item_id=item_id):
        loguru_logger.info("started processing item")

        if item_id % 3 == 0:
            loguru_logger.info("skip: bad item id")
            continue

        loguru_logger.info("successfully processed item")

Output:

2025-01-16 10:48:40.793 | INFO     | __main__:<module>:34 - started processing item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:37 - skip: bad item id
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:34 - started processing item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:40 - successfully processed item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:34 - started processing item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:40 - successfully processed item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:34 - started processing item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:37 - skip: bad item id
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:34 - started processing item
2025-01-16 10:48:40.793 | INFO     | __main__:<module>:40 - successfully processed item

Unfortunately, the output does not contain any variables from contextualize.

To make it work, I have to use serialize and play around with removing / adding sinks (arguably, boilerplate):

import sys

from loguru import logger as loguru_logger

loguru_logger.remove()
loguru_logger.add(sys.stdout, serialize=True)

Output:

{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:34 - started processing item\n", "record": {"elapsed": {"repr": "0:00:00.006409", "seconds": 0.006409}, "exception": null, "extra": {"item_id": 0}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 34, "message": "started processing item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447282+02:00", "timestamp": 1737016929.447282}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:37 - skip: bad item id\n", "record": {"elapsed": {"repr": "0:00:00.006508", "seconds": 0.006508}, "exception": null, "extra": {"item_id": 0}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 37, "message": "skip: bad item id", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447381+02:00", "timestamp": 1737016929.447381}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:34 - started processing item\n", "record": {"elapsed": {"repr": "0:00:00.006561", "seconds": 0.006561}, "exception": null, "extra": {"item_id": 1}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 34, "message": "started processing item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447434+02:00", "timestamp": 1737016929.447434}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:40 - successfully processed item\n", "record": {"elapsed": {"repr": "0:00:00.006593", "seconds": 0.006593}, "exception": null, "extra": {"item_id": 1}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 40, "message": "successfully processed item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447466+02:00", "timestamp": 1737016929.447466}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:34 - started processing item\n", "record": {"elapsed": {"repr": "0:00:00.006623", "seconds": 0.006623}, "exception": null, "extra": {"item_id": 2}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 34, "message": "started processing item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447496+02:00", "timestamp": 1737016929.447496}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:40 - successfully processed item\n", "record": {"elapsed": {"repr": "0:00:00.006756", "seconds": 0.006756}, "exception": null, "extra": {"item_id": 2}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 40, "message": "successfully processed item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447629+02:00", "timestamp": 1737016929.447629}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:34 - started processing item\n", "record": {"elapsed": {"repr": "0:00:00.006814", "seconds": 0.006814}, "exception": null, "extra": {"item_id": 3}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 34, "message": "started processing item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447687+02:00", "timestamp": 1737016929.447687}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:37 - skip: bad item id\n", "record": {"elapsed": {"repr": "0:00:00.006847", "seconds": 0.006847}, "exception": null, "extra": {"item_id": 3}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 37, "message": "skip: bad item id", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447720+02:00", "timestamp": 1737016929.44772}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:34 - started processing item\n", "record": {"elapsed": {"repr": "0:00:00.006874", "seconds": 0.006874}, "exception": null, "extra": {"item_id": 4}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 34, "message": "started processing item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447747+02:00", "timestamp": 1737016929.447747}}}
{"text": "2025-01-16 10:42:09.447 | INFO     | __main__:<module>:40 - successfully processed item\n", "record": {"elapsed": {"repr": "0:00:00.006896", "seconds": 0.006896}, "exception": null, "extra": {"item_id": 4}, "file": {"name": "t.py", "path": "/Users/senpos/personal/scratch/t.py"}, "function": "<module>", "level": {"icon": "ℹ️", "name": "INFO", "no": 20}, "line": 40, "message": "successfully processed item", "module": "t", "name": "__main__", "process": {"id": 2011, "name": "MainProcess"}, "thread": {"id": 8352744000, "name": "MainThread"}, "time": {"repr": "2025-01-16 10:42:09.447769+02:00", "timestamp": 1737016929.447769}}}

And even then, it is extremely noisy, as most of the time I don't need serialized logs. What I wanted is to avoid prefixing each log call with [item_id={item_id}], so it is less typing + I don't forget to add it somewhere.

senpos avatar Jan 16 '25 08:01 senpos

For what it's worth, I agree that arguments put through contextualize should also be shown in the default logging alongside kwargs in the log calls - and to that end made a solution that implements this behaviour by using something like the following as my loguru configuration:

import loguru

def formatter(record: "loguru.Record") -> str:
    return "".join((
        "<green>{time}</green> ",
        "<level>{level:<8s}</level> [{file}:{line}] | <level>{message}</level> ",
        "<green>{extra}</green>" if record["extra"] else "",
        "\n{exception}",
    ))

loguru.logger.remove(0)
loguru.logger.add(sys.stdout, format=formatter, diagnose=True, backtrace=True, colorize=True)

Which outputs logging from the following example similar to that in the screenshot

filenames: list[str] = [f"file_{i}.txt" for i in range(5)]
log.info("Doing something with files")
for filename in filenames:
    with log.contextualize(filename=filename):
        log.debug(f"Loading {filename}")
    	log.info("Loaded file successfully")
Image

Just adding here in case it proves helpful anyone else!

devsjc avatar Feb 07 '25 17:02 devsjc

Thanks for the additional use cases.

It's more likely that one would want to output the contextualize() values than not. Therefore, it could indeed become the default.

I still think contextualize() is closer to bind() than **kwargs conceptually, so it would likely appear somewhere before the log message.

Delgan avatar Feb 12 '25 19:02 Delgan