python-logfmter icon indicating copy to clipboard operation
python-logfmter copied to clipboard

feat: add logfmt parser

Open spaceone opened this issue 3 months ago • 11 comments

Fixes: #28

Depends on #49 and #50

spaceone avatar Sep 24 '25 10:09 spaceone

It would be good to have a parametrized round-trip test - that is, take a (valid) dict of strings to strings, logfmt it, then parse_logfmt, and check that the dict you get back is the dict you put in. One could then construct variously unpleasant dictionaries and verify that they came through all right. One could also add a check that invalid dictionaries that didn't make sense in logfmt (empty string as key?) raise a sensible exception.

td-anne avatar Sep 24 '25 11:09 td-anne

My test already compares the format with the parsed result.

spaceone avatar Sep 24 '25 11:09 spaceone

My test already compares the format with the parsed result.

Indeed. But round-trip tests are a verification of both parts of the system, and are much easier to construct difficult examples for.

td-anne avatar Sep 24 '25 12:09 td-anne

I can easily split the tests case and make the round trip: assert parse_logfmt(Logfmter.format_params(expected), **kwargs) == expected

but I currently already also verify the exact resulting string. And then I need at least 2 test cases - because there are many cases which are valid but aren't the canonical form logfmter would produce. E.g. at="INFO" is equally valid to at=INFO.

spaceone avatar Sep 24 '25 12:09 spaceone

@td-anne I added two more additional round trip tests

spaceone avatar Sep 24 '25 12:09 spaceone

@spaceone once this is merged with latest (and conflicts resolved), it should pass the tests (since newline update has been merged)

josheppinette avatar Sep 24 '25 22:09 josheppinette

And then I need at least 2 test cases - because there are many cases which are valid but aren't the canonical form logfmter would produce. E.g. at="INFO" is equally valid to at=INFO.

If we really wanted this, then we would probably have some normalization before comparing the strings, but I don't think its necessary (at least for initial rollout).

josheppinette avatar Sep 24 '25 22:09 josheppinette

@josheppinette I rebased it. But I don't think it's complete. We should speak about some behavior:

  1. Should we allow ' single quotes? logfmt only produces double quotes. I disabled them for now. go-logfmt declined a PR to add single quote support as well.
  2. Strings can come from user input, so they are arbitrary. How would you expect a = b to be parsed?
$ echo 'a = b' | ./external/golang-logfmt-echo/golang-logfmt-echo 
error: logfmt syntax error at pos 3 on line 1: unexpected '='

I can't easily detect this case.

  1. The current implementation doesn't put nested keys back into a dictionary structure. But doing so, is unsafe.
  2. maybe more things to consider.

spaceone avatar Sep 24 '25 22:09 spaceone

  1. I am not sure about compatibility with unquoted backslash sequences. The external compatibility test fails due to my demo change in 72e906065ef3b500f8f739fba48f06713f9cf62e. For example: logfmt currently produces:
>>> logfmter.Logfmter.format_params({'a': '\\', 'b': 'c'})
'a=\\ b=c'

Similar to the newline issue, I would have expected:

>>> logfmter.Logfmter.format_params({'a': '\\', 'b': 'c'})
'a="\\\\" b=c'

Would you consider this change as valid?

spaceone avatar Sep 24 '25 23:09 spaceone

$ echo 'foo="\\" bar=baz' | ./golang-logfmt-echo 
foo=\ bar=baz

spaceone avatar Sep 24 '25 23:09 spaceone

→ https://github.com/go-logfmt/logfmt/issues/16

spaceone avatar Sep 24 '25 23:09 spaceone