ipykernel
Run with Python UTF-8 mode enabled?
There's a proposal for Python to use UTF-8 mode on Windows by default. Most significantly, this means that open()-ing a file in text mode would use UTF-8 by default rather than a locale-dependent encoding.
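To make the difference concrete, here is a minimal sketch of how to see which defaults a given interpreter is using, and how to sidestep them by passing an explicit encoding. The helper names `describe_text_defaults` and `roundtrip` are made up for illustration; only `sys.flags.utf8_mode`, `locale.getpreferredencoding` and `pathlib` come from the standard library.

```python
import locale
import sys
import tempfile
from pathlib import Path


def describe_text_defaults():
    """Report the settings that decide what open() uses in text mode."""
    return {
        # True when the interpreter was started in UTF-8 mode.
        "utf8_mode": bool(sys.flags.utf8_mode),
        # The encoding open() falls back to when none is given.
        "preferred_encoding": locale.getpreferredencoding(False),
    }


def roundtrip(text, path):
    """Write and read back text with an explicit encoding (hypothetical helper)."""
    path.write_text(text, encoding="utf-8")
    return path.read_text(encoding="utf-8")


if __name__ == "__main__":
    print(describe_text_defaults())
    with tempfile.TemporaryDirectory() as tmp:
        print(roundtrip("héllo ✓", Path(tmp) / "sample.txt"))
```

Passing `encoding="utf-8"` explicitly behaves the same whether or not UTF-8 mode is on, which is why it is the usual workaround today.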
This is controversial, and one step that has been suggested is for Jupyter to enable UTF-8 mode for kernels before a change is made in Python, to explore how well it works. Changing the default is arguably also more favourable to new programmers who don't have a lot of existing code, and Jupyter is often used in teaching settings.
I'm strongly in favour of changing the default in Python, but I'd be -0.5 on only changing it in Jupyter, because that would mean code inside Jupyter behaves differently from code outside. But I'd be more in favour if changing it in Jupyter was a prelude to, or test case for, changing it in Python.
Jupyter assumes UTF-8 in a number of places (the message protocol and ipynb file format, at least), so I'd support trying this. Not 100% sure, though, because as you said, behaving like Python normally does is a key goal. But if it's as a preview of a Python feature, I'd be okay with it as a sort of early-adopter feature, with the caveat that if the Python proposal is ultimately rejected, we revert the behavior in Jupyter as well.
UTF-8 mode has existed since Python 3.7. We can already use it by setting PYTHONUTF8=1 in .envrc, or by running python3 -Xutf8 myapp.py.
Since Python already provides an opt-in setting, adding an option to Jupyter doesn't mean changing Jupyter before Python, unless it becomes the default.
Although I think UTF-8 mode is very nice for data scientists and students learning Python on Windows, we (Python core developers) cannot change the default setting because there are many legacy Python applications in the wild.
I want to make UTF-8 mode accessible to Python users who don't use the command line and don't know about environment variables. Please add an option that can be configured in the GUI.
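Until such an option exists, one possible workaround is a separate kernel spec that sets PYTHONUTF8 for the kernel process. The sketch below only builds the contents of a `kernel.json`; it assumes the standard kernel spec keys (`argv`, `display_name`, `language`, `env`) and the usual `ipykernel_launcher` entry point. The function name and display name are invented for illustration, and this is not the GUI option requested above.

```python
import json
import sys


def utf8_kernel_spec(display_name="Python 3 (UTF-8 mode)"):
    """Build a kernel.json-style dict that starts the kernel in UTF-8 mode."""
    return {
        "argv": [
            sys.executable,
            "-m",
            "ipykernel_launcher",
            "-f",
            "{connection_file}",  # placeholder filled in by Jupyter
        ],
        "display_name": display_name,  # hypothetical name
        "language": "python",
        # Environment variables applied to the kernel process.
        "env": {"PYTHONUTF8": "1"},
    }


if __name__ == "__main__":
    print(json.dumps(utf8_kernel_spec(), indent=2))
```

The printed JSON would need to be saved as `kernel.json` in a kernel spec directory to take effect; where that directory lives depends on the installation.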
Hello @methane is there any difference between using -Xutf8 and -X utf8 (with space after -X)? Official docs say -X utf8: https://docs.python.org/3/using/cmdline.html#cmdoption-X
No difference. Both are OK.
$ python3 -Xutf8 -c 'import sys; print(sys.flags.utf8_mode)'
1
$ python3 -X utf8 -c 'import sys; print(sys.flags.utf8_mode)'
1
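The same check can be scripted from Python, which also covers the PYTHONUTF8 environment variable. This is a rough sketch: the function names are hypothetical, and it assumes the child interpreter is the same Python 3.7+ executable that runs the script.

```python
import os
import subprocess
import sys

# One-liner run in the child interpreter to report its UTF-8 mode flag.
CHECK = "import sys; print(sys.flags.utf8_mode)"


def utf8_mode_with_flag():
    """Start a child interpreter with -X utf8 and return its utf8_mode flag."""
    result = subprocess.run(
        [sys.executable, "-X", "utf8", "-c", CHECK],
        capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


def utf8_mode_with_env():
    """Start a child interpreter with PYTHONUTF8=1 and return its utf8_mode flag."""
    env = dict(os.environ, PYTHONUTF8="1")
    result = subprocess.run(
        [sys.executable, "-c", CHECK],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(result.stdout.strip())


if __name__ == "__main__":
    print(utf8_mode_with_flag(), utf8_mode_with_env())
```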
FYI, we are discussing making UTF-8 mode the default again: https://peps.python.org/pep-0686/
It is good for most Python users, except those who need to interact with many legacy command-line applications or text files in legacy encodings.
PEP 686 has been accepted. UTF-8 mode will become the default in Python 3.15.
Providing an easy way to opt in to UTF-8 mode would be helpful for many Windows users until Python 3.15.
When I ran cmake on a project in Ubuntu, I got the following error: cmake -E env unknown option '-Xutf8'.