paste
paste copied to clipboard
Fix UnicodeDecodeError for Umlaute in translogger
Fix for https://github.com/cdent/paste/issues/73 - do not use %b
use %m
instead.
Unfortunately this fix introduces a potential compatibility issue that we need to think about before being certain we want to merge it:
If any external tooling is parsing the log message and expecting a month name, replacing it with numbers will break that parsing.
Ideas on how to deal with that?
cc @FND as they might have some insight
Ouch, that seems tricky indeed...
I don't really know much about Paste's user base, nor can I quite imagine why someone would want localized logs (fully aware that's somewhat ignorant). The fact that Paste's relationship with developers is often indirect (AIUI, they might think they're using Plone, with Paste working under the covers) makes this even harder to assess.
I suppose the only safe option is to make this configurable: Given that Paste is in maintenance mode, changing behavior seems ill-advised - that leaves the opt-in route: If this was a TransLogger
constructor argument, at least users (including frameworks) would have the option of fixing this issue on their end, where they have more context to make a decision.
I completely agree that localized logs are unnecessary. However, when running the program on MacOS, the logs are translated by default - perhaps due to my messy setup. It would be helpful to have an optional configuration for this feature. Additionally, using unicode/UTF-8 strings would be beneficial for compatibility - even with outdated versions of Python 2.
@2silver are you happy to adjust the patch to have Translogger take an argument, something like numeric_month=False
that if true uses %m
.
My preference is to keep the extent of changes as small as possible, and an optional kwarg is probably the easiest way to do that.
Any reason not to make the entire timestamp configurable while you're at it, e.g. timestamp='%d/%b/%Y:%H:%M:%S'
?
Any reason not to make the entire timestamp configurable while you're at it, e.g. timestamp='%d/%b/%Y:%H:%M:%S'?
That's a good idea.
Which means we end up with two options on how to do this:
- as a passed in kwarg as defined
- as a class variable which can be overridden in a subclass (the issue with the current code is that it is hard to easily subclass for changed behaviour: the existing method is doing too much)
I tried it with this commit https://github.com/cdent/paste/pull/74/commits/0db6e41f7a2f6798168c591dfcd50152e0476ca3
Sorry for not getting back to you sooner about this. I've unfortunately not yet had time to give this a proper review. A quick glance shows that the changes a conceptually okay, but some of the details of the Python code need some work. Depending on your preference I can help you to push it in the right direction, once I get time to give it a proper review, or I can take the work you've started and change it myself. What's your preference?
My apologies or taking such a very long time to look at this. I've gone ahead and don't a different revision in #80 . If you could try that and let me know if it works for your purposes I'll go ahead and merge it and make a new release.
The concepts in my implementation are the same as yours, but done in a simpler fashion.