modernize
--future-unicode breaks handling of unicode() calls.
$ cat t.py
print(unicode("Hello World"))
$ python-modernize t.py
-print(unicode("Hello World"))
+from __future__ import print_function
+from __future__ import absolute_import
+import six
+print((six.text_type("Hello World")))
$ python-modernize --future-unicode t.py
-print(unicode("Hello World"))
+from __future__ import print_function
+from __future__ import unicode_literals
+print((str("Hello World")))
Sadly, this translation breaks anything that requires a unicode object in Python 2.x.
The same thing happens with unichr and chr. It seems that modernize "thinks" unicode_literals changes the semantics of chr/str to those of unichr/unicode in Python 2, which it obviously does not.
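To make the point above concrete, here is a small sketch that runs on both Python 2 and Python 3. It assumes only that unicode_literals affects string literals (as if each carried a u prefix) and leaves the str and chr builtins alone; the variable names are made up for illustration.

```python
# Illustrative sketch: unicode_literals only changes what a bare string
# literal means; it does not rebind the str() or chr() builtins.
text_type = type(u"")       # unicode on Python 2, str on Python 3

literal = u"Hello World"    # what "Hello World" means under unicode_literals
converted = str(literal)    # what the --future-unicode rewrite produces
char = chr(65)              # what a unichr(65) call would be rewritten to

# On Python 2 the last two lines report False, because str() and chr()
# still return byte strings there. On Python 3 all three report True.
print(type(literal) is text_type)
print(type(converted) is text_type)
print(type(char) is text_type)
```

So on Python 2 the rewritten code silently hands byte strings to anything that expected unicode objects.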
Interestingly, 2to3 also changes unicode to str.
This seems to be related to the fact that fix_unicode_future.py derives from lib2to3.fixes.fix_unicode.
Yes, the --future-unicode option should probably enable a fix with a higher priority that checks for unicode being called on a string literal and eliminates the call before it gets rewritten to str (i.e. unicode('foo') --> 'foo').
Do you want to have a go at this? There are plenty of examples of fixes and tests to copy from.
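For anyone picking this up, the intended transformation can be sketched roughly as below. Note this is not a lib2to3 fixer: it uses the standard library ast module (Python 3.9+ for ast.unparse) purely to illustrate the rule "drop unicode() when its only argument is a string literal". The class and function names are hypothetical, and a real fixer would use a lib2to3 pattern and keep the original formatting.

```python
import ast


class StripUnicodeCall(ast.NodeTransformer):
    """Hypothetical helper: replace unicode("literal") with the bare literal."""

    def visit_Call(self, node):
        # Handle nested calls first, e.g. print(unicode("x")).
        self.generic_visit(node)
        is_unicode_name = (
            isinstance(node.func, ast.Name) and node.func.id == "unicode"
        )
        single_literal = (
            len(node.args) == 1
            and not node.keywords
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        )
        if is_unicode_name and single_literal:
            # Eliminate the call and keep only the string literal.
            return node.args[0]
        return node


def strip_unicode_calls(source):
    """Return source with unicode(<string literal>) calls removed."""
    tree = StripUnicodeCall().visit(ast.parse(source))
    return ast.unparse(ast.fix_missing_locations(tree))


print(strip_unicode_calls('print(unicode("Hello World"))'))
print(strip_unicode_calls("print(unicode(value))"))
```

Calls on non-literals (the second example) are left untouched, so they can still be handled by whichever fixer runs next.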
Do you think it would be better to raise the priority of libmodernize.fixes.fix_unicode_type (or maybe some other fixers), or just lower that of libmodernize.fixes.fix_unicode_future?
For future reference, changing the priority is done by adding a run_order member on the Fixer class; the default value is 5, smaller numbers are higher priorities. See https://github.com/python/cpython/blob/master/Lib/lib2to3/fixer_base.py#L33
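As a rough illustration of how run_order affects ordering, the sketch below uses stand-in classes rather than real lib2to3 fixers, so it runs without lib2to3 installed. The class names are hypothetical, and it assumes the refactoring tool simply sorts fixers by their run_order attribute.

```python
import operator


class StandInBaseFix(object):
    """Stand-in for lib2to3.fixer_base.BaseFix (not the real class)."""
    run_order = 5  # default priority, as described above


class RewriteUnicodeToStr(StandInBaseFix):
    """Hypothetical fixer that keeps the default priority."""


class StripUnicodeOnLiterals(StandInBaseFix):
    """Hypothetical fixer that must run first."""
    run_order = 4  # smaller number, so higher priority


fixers = [RewriteUnicodeToStr, StripUnicodeOnLiterals]

# Assumption: fixers are applied in ascending run_order.
ordered = sorted(fixers, key=operator.attrgetter("run_order"))
print([fixer.__name__ for fixer in ordered])
# prints ['StripUnicodeOnLiterals', 'RewriteUnicodeToStr']
```

Either raising one fixer's priority or lowering the other's gives the same relative order; what matters is that the literal-stripping fix ends up with the smaller number.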