ipykernel

Run with Python UTF-8 mode enabled?

Open takluyver opened this issue 5 years ago • 6 comments

There's a proposal for Python to use UTF-8 mode on Windows by default. Most significantly, this means that open()-ing a file in text mode would use UTF-8 by default rather than a locale-dependent encoding.
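To make the difference concrete, here is a minimal sketch (not part of the proposal itself) that reports which encoding `open()` picks when none is given. The file name `sample.txt` and the sample string are made up for illustration; the reported encoding depends on the platform, the locale, and whether UTF-8 mode is on.

```python
import os
import sys
import tempfile

text = "naïve café"  # sample text with non-ASCII characters (illustrative only)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")  # hypothetical file name

    # Explicit encoding: the same bytes are written on every platform.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)

    # No encoding given: Python chooses a default. With UTF-8 mode enabled
    # this is UTF-8; otherwise it follows the locale (for example a legacy
    # code page on many Windows setups).
    with open(path) as f:
        default_encoding = f.encoding

    # Reading back with the explicit encoding always round-trips.
    with open(path, encoding="utf-8") as f:
        round_trip = f.read()

print("UTF-8 mode enabled:", bool(sys.flags.utf8_mode))
print("Default text encoding:", default_encoding)
```

Passing `encoding="utf-8"` explicitly sidesteps the question entirely, which is why code written that way behaves the same inside and outside UTF-8 mode.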

This is controversial, and one step that has been suggested is for Jupyter to enable UTF-8 mode for kernels before a change is made in Python, to explore how well it works. Changing the default is arguably also more favourable to new programmers who don't have a lot of existing code, and Jupyter is often used in teaching settings.

I'm strongly in favour of changing the default in Python, but I'd be -0.5 on only changing it in Jupyter, because that would mean code inside Jupyter behaves differently from code outside. But I'd be more in favour if changing it in Jupyter was a prelude to, or test case for, changing it in Python.

takluyver, Feb 10 '20 10:02

Jupyter assumes UTF-8 in a number of places (the message protocol and ipynb file format, at least), so I'd support trying this. Not 100% sure, though, because as you said, behaving like Python normally does is a key goal. But if it's as a preview of a Python feature, I'd be okay with it as a sort of early-adopter feature, with the caveat that if the Python proposal is ultimately rejected, we revert the behavior in Jupyter as well.

minrk, Feb 13 '20 15:02

UTF-8 mode has existed since Python 3.7. It can already be enabled by setting PYTHONUTF8=1 (for example in .envrc) or by running python3 -Xutf8 myapp.py.

Since Python already provides an opt-in setting, adding an option to Jupyter doesn't mean changing Jupyter before Python, unless the option becomes the default.

Although I think UTF-8 mode is very nice for data scientists and students learning Python on Windows, we (Python core developers) cannot change the default setting because there are many legacy Python applications in the wild.

I want to make UTF-8 mode accessible to Python users who don't use the command line and don't know about environment variables. Please add an option that can be configured in a GUI.
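As a rough illustration of how a launcher could opt a child interpreter in, the sketch below starts a fresh Python process with either the environment variable or the `-X` option and reads back `sys.flags.utf8_mode`. The helper name `child_utf8_mode` is made up for this example, and nothing here reflects how Jupyter actually starts kernels.

```python
import os
import subprocess
import sys

# One-liner run in the child process to report its UTF-8 mode flag.
CHECK = "import sys; print(sys.flags.utf8_mode)"


def child_utf8_mode(extra_args=(), extra_env=None):
    """Start a child Python and return its sys.flags.utf8_mode value."""
    # Start from the current environment, minus any inherited PYTHONUTF8,
    # so each call only sees the setting it asks for.
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUTF8"}
    env.update(extra_env or {})
    result = subprocess.run(
        [sys.executable, *extra_args, "-c", CHECK],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout.strip())


# Opt in through the environment variable.
via_env = child_utf8_mode(extra_env={"PYTHONUTF8": "1"})

# Opt in through the command-line option.
via_flag = child_utf8_mode(extra_args=["-X", "utf8"])

# With neither set, the result depends on the Python version and the locale.
by_default = child_utf8_mode()

print(via_env, via_flag, by_default)
```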
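Until such an option exists, one possible workaround is a separate kernel spec whose `env` entry sets PYTHONUTF8=1 for the kernel process. The sketch below only writes a `kernel.json` into a temporary directory; the spec name `python3-utf8` and the display name are hypothetical, and this is an assumption about how one might set it up, not something ipykernel ships.

```python
import json
import os
import sys
import tempfile

# Kernel spec modelled on the usual ipykernel launch command, with one
# addition: an "env" entry that turns on UTF-8 mode for the kernel process.
kernel_spec = {
    "argv": [
        sys.executable,
        "-m",
        "ipykernel_launcher",
        "-f",
        "{connection_file}",
    ],
    "display_name": "Python 3 (UTF-8 mode)",  # hypothetical display name
    "language": "python",
    "env": {"PYTHONUTF8": "1"},
}

# Write the spec to a scratch directory instead of installing it.
spec_dir = os.path.join(tempfile.mkdtemp(), "python3-utf8")  # hypothetical name
os.makedirs(spec_dir)
spec_path = os.path.join(spec_dir, "kernel.json")

with open(spec_path, "w", encoding="utf-8") as f:
    json.dump(kernel_spec, f, indent=2)

# Read it back to confirm what was written.
with open(spec_path, encoding="utf-8") as f:
    loaded = json.load(f)

print(spec_path)
```

A directory like this could then be registered with `jupyter kernelspec install`, after which the new kernel would appear in the kernel picker next to the default one. That still requires a command line once, so it does not replace a real GUI setting.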

methane, Mar 06 '21 00:03

Hello @methane, is there any difference between using -Xutf8 and -X utf8 (with a space after -X)? The official docs say -X utf8: https://docs.python.org/3/using/cmdline.html#cmdoption-X

rafrafek, Apr 11 '22 09:04

No difference. Both are OK.

$ python3 -Xutf8 -c 'import sys; print(sys.flags.utf8_mode)'
1

$ python3 -X utf8 -c 'import sys; print(sys.flags.utf8_mode)'
1

FYI, we are discussing making UTF-8 mode the default again. https://peps.python.org/pep-0686/

It is good for most Python users, except those who need to interact with many legacy command-line applications or text files in legacy encodings.

methane, Apr 11 '22 10:04

PEP 686 has been accepted. UTF-8 mode will become the default in Python 3.15.

Providing an easy way to opt in to UTF-8 mode would help many Windows users until Python 3.15.

methane, Jun 28 '22 01:06

When I ran cmake on a project on Ubuntu, I got the following error: cmake -E env unknown option '-Xutf8'.

WilsonRed, Apr 03 '24 02:04