Unicode encoding error
Fixes an issue in Pluggy where surrogate escape characters are not encoded correctly, causing Pluggy hooks to crash with a UnicodeEncodeError. From issue 13750 in pytest.
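For context, the failure boils down to a lone surrogate not being encodable as strict UTF-8, so writing it to a log stream raises. A minimal illustration (not the issue's exact reproducer):

bad = "\ud800"       # a lone surrogate character
bad.encode("utf-8")  # raises UnicodeEncodeError: surrogates not allowed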
Why is codecov failing for us following this change in pluggy/src/pluggy/_tracing.py? Thanks!
Thank you for the PR @Ghanchu !
Before going forward with this
https://github.com/pytest-dev/pytest/blob/main/src%2F_pytest%2Fhelpconfig.py#L129 the debugfile should perhaps be opened in surrogate escape mode
We tried implementing surrogate escape mode for the debug file, but it did not fix the original problem in the issue, nor the unicode surrogate tests in test_tracer.py. It only affects the debug file writer, not other writers (lists, stdout, or internal pytest hooks), so if any other part of pytest writes a message containing a surrogate, you still get a UnicodeEncodeError.
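For reference, the change we tried looks roughly like this (a sketch; the actual open() call in _pytest/helpconfig.py differs in its details):

# Sketch of opening the --debug trace file in surrogate escape mode.
debugfile = open(
    "pytestdebug.log",          # illustrative path, not necessarily the real default
    "w",
    encoding="utf-8",
    errors="surrogateescape",   # round-trip lone surrogates instead of raising on write
)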
@RonnyPfannschmidt I don't understand, why surrogate escapes? Aren't those intended for cases where you want to "mangle" invalid characters to put them somewhere, where later the same application (or a different one) can "unmangle" them again to get the originals back?
This is a debug log, so the last thing that should be happening IMHO is "weird unicode things" (which, say, break a hook) being represented in an even more confusing way, no?
That being said, I also don't think replace is a good idea here; I think the most (only?) sensible thing for debug output is using backslashreplace.
That all being said, why does _format_message use str for formatting the hook (?) arguments in the first place? Wouldn't using repr() be way more appropriate for a debug log, and fix that problem too?
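To make the options concrete, here is how they behave on a lone surrogate (illustration only, not pluggy code):

s = "\ud800"   # a lone surrogate

# str() keeps the raw surrogate, so encoding it for the log file fails:
try:
    str(s).encode("utf-8")
except UnicodeEncodeError as exc:
    print(exc)                                        # ... surrogates not allowed

# backslashreplace escapes it at encode time:
print(s.encode("utf-8", errors="backslashreplace"))   # b'\\ud800'

# repr() escapes it at format time, before any I/O happens:
print(repr(s))                                        # '\ud800'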
It's using str because it's the log message. Indeed, backslashreplace is a better choice.
As for dealing with invalid surrogates in general, I prefer ignoring them everywhere except output encoding for files.
It's using str because it's the log message
Huh? I mean here, where pluggy formats hook debug output (Python values!) into the string that should be written to the debug log:
https://github.com/pytest-dev/pluggy/blob/5b24a4df70d963ee2dc92f663d3ac83155ba0bad/src/pluggy/_tracing.py#L38
This results in a log file like this:
pytest_plugin_registered [hook]
plugin: <_pytest.cacheprovider.LFPlugin object at 0x7f0740af6510>
plugin_name: lfplugin # !!
manager: <_pytest.config.PytestPluginManager object at 0x7f0741ac52b0>
finish pytest_plugin_registered --> [] [hook]
or
pytest_report_header [hook]
config: <_pytest.config.Config object at 0x7f0741ac7770>
start_path: /home/.../proj/pytest # !!
startdir: /home/.../proj/pytest # !!
I'm arguing that this is the wrong thing to do for a debug log, and this should be {value!r} instead:
pytest_plugin_registered [hook]
plugin: <_pytest.cacheprovider.LFPlugin object at 0x7f1864b90980>
plugin_name: 'lfplugin' # !!
manager: <_pytest.config.PytestPluginManager object at 0x7f18657370e0>
or
pytest_report_header [hook]
config: <_pytest.config.Config object at 0x7f1864efa270>
start_path: PosixPath('/home/.../proj/pytest') # !!
startdir: local('/home/.../proj/pytest') # !!
That seems more appropriate for debug output (now you can actually see that start_path and startdir aren't the same type as plugin_name in the other message!), and it would fix this issue as well.
Ah, I spoke too soon: that alone doesn't fix this, but also using repr(...) here for the return value does:
https://github.com/pytest-dev/pluggy/blob/5b24a4df70d963ee2dc92f663d3ac83155ba0bad/src/pluggy/_manager.py#L504
That means it gets logged as finish pytest_fixture_setup --> '\ud800' [hook] rather than trying to put a literal unicode character (mangled or not) into the debug log file.
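For concreteness, the shape of the two repr() placements is roughly this (illustrative stand-ins, not the actual pluggy source):

def format_extra(indent: str, extra: dict) -> list[str]:
    # stand-in for the per-argument loop in TagTracer._format_message();
    # {value!r} escapes lone surrogates and also shows each value's type
    return [f"{indent}    {name}: {value!r}\n" for name, value in extra.items()]

def trace_hook_finish(write, hook_name: str, result: object) -> None:
    # stand-in for the call site in _manager.py: formatting with !r means the
    # return value is written as e.g. '\ud800' instead of a raw surrogate
    write(f"finish {hook_name} --> {result!r} [hook]")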
Good find
Those repr placements are way more effective
We made the changes using repr instead of replace. This required updating the test cases, because repr() converts strings to Python literals, which adds quotes and escapes special characters. Also, would using repr cause similar performance issues to using replace? Should we use a try/except instead: try normal string handling first, and fall back to repr when an encoding error occurs, just to improve performance?
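The fallback idea would look something like this (a sketch with a hypothetical helper name, not proposed code as-is):

def format_value(value: object) -> str:
    text = str(value)
    try:
        text.encode("utf-8")       # probe: would this survive the log file's encoder?
        return text
    except UnicodeEncodeError:
        return repr(value)         # fall back to the escaped representation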
As for anything related to performance, reality can be counter-intuitive and nothing is certain until a benchmark is done ;) (which doesn't have to be super elaborate in this case)
We went ahead and benchmarked three approaches: backslashreplace, repr, and the original implementation, assuming roughly 1% of inputs contain surrogate escape characters. In practice, the performance differences between them were negligible.
We can share the benchmark setup and results here if that would be helpful.
Thank you for checking. Yes, can you share how you did the benchmark, please?
Yes, we used pytest-benchmark and tested _format_message by creating a list of 100 strings, including one string containing a surrogate escape character, and passing those strings as input to _format_message. Runtime scaled linearly with the input size in every case.
import pytest

from pluggy._tracing import TagTracer

# word list with one lone surrogate appended, so a small fraction of inputs
# exercises the error-handling path
random_words = [
    "f3lta", "xorp9", "jungle42", "qwerty", "plmokn", "vex99", "blitz7", "zappy", "frodo1", "snark",
    "gizmo", "tr4ce", "whisp3r", "dynamo", "flux88", "sparkle", "boggle", "mirth5", "quasar", "fiddle",
    "pixel", "blorp", "whisk9", "tango", "jumble3", "nifty", "krypton", "buzz7", "glimmer", "zoodle",
    "snazzy", "cog42", "blip", "jolt8", "fumble", "vortex", "tweak9", "spork", "fizz3", "glint",
    "moxie", "quibble", "zap99", "fluff", "sprout7", "drift", "twizzle", "caper", "plucky", "frizzle",
    "zing", "whirly", "snip3", "flame", "blabber", "crux9", "froth", "spangle", "glitch", "twang7",
    "jitter", "mingle", "puzzle3", "flicker", "bramble", "sprocket", "twitch9", "glimpse", "sizzle",
    "puddle", "quark7", "drizzle", "spark", "muddle", "twirl3", "flapper", "boggle9", "shimmer", "plop",
    "frump", "dazzle", "zippity", "cackle3", "fritter", "whistle", "snuggle", "tangle7", "fizzle",
    "glimmer9", "piffle", "snort", "twinkle", "frizzle3", "muffin", "claptrap", "bluster", "zing3",
    "whomp", "crackle", "swoosh7", "flipper", "nibble", "twizzle3", "guffaw", "plonk", "snicker",
    "\ud800",
]

@pytest.fixture(params=[1, 10, 1000, 10000])
def messages(request):
    # scale the input size to check that runtime grows linearly
    return random_words * request.param

def test_format_message_speed(benchmark, messages):
    tracer = TagTracer()

    def call_messages():
        for msg in messages:
            tracer._format_message(["pytest"], [msg])

    benchmark(call_messages)
I also thought performance might be similar across the different _format_message implementations simply because each benchmark run loops through a whole list, so I made a benchmark that passes only a single string to _format_message for each implementation, and performance was still very similar.
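The single-string benchmark looked roughly like this (reconstructed sketch; the actual code may differ):

from pluggy._tracing import TagTracer

def test_format_single_message_speed(benchmark):
    tracer = TagTracer()
    # one call per round, no list iteration involved
    benchmark(tracer._format_message, ["pytest"], ["\ud800"])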