test_input_tty hangs when run multiple times in the same process on macOS 10.15

Open ambv opened this issue 3 years ago • 13 comments

BPO 44887
Nosy @vstinner, @ambv, @Fidget-Spinner, @akulakov

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-08-11.10:04:26.450>
labels = ['3.8', '3.9', '3.10', '3.11']
title = 'test_input_tty hangs when run multiple times in the same process on macOS 10.15'
updated_at = <Date 2021-11-13.00:49:20.228>
user = 'https://github.com/ambv'

bugs.python.org fields:

activity = <Date 2021-11-13.00:49:20.228>
actor = 'andrei.avk'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2021-08-11.10:04:26.450>
creator = 'lukasz.langa'
dependencies = []
files = []
hgrepos = []
issue_num = 44887
keywords = []
message_count = 10.0
messages = ['399380', '399382', '399384', '399385', '399396', '399398', '399412', '399413', '400631', '406262']
nosy_count = 4.0
nosy_names = ['vstinner', 'lukasz.langa', 'kj', 'andrei.avk']
pr_nums = []
priority = 'low'
resolution = None
stage = 'test needed'
status = 'open'
superseder = None
type = None
url = 'https://bugs.python.org/issue44887'
versions = ['Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']

ambv avatar Aug 11 '21 10:08 ambv

(I'm still investigating at the moment whether something changed in my environment.)

Running the following right now hangs on test_input_tty for me:

./python.exe -m test test_builtin test_builtin -v

This fails on all branches down to and including 3.7, so I assume this is environment-specific, unless it's a regression from a change that was backported all the way to 3.7. That is out of the question, as the last functional commit on 3.7 was back in June.

Things I tried so far:

  • rebooting;
  • using another terminal app (I use iTerm2 by default, tried Terminal.app too);
  • another shell (I use fish by default, tried bash 5.0 as well);
  • a non-pydebug build (I use pydebug builds by default to run -R:)

The test in question uses readline if it is available, and sysconfig.get_config_vars()['HAVE_LIBREADLINE'] returns 1. I'll be trying to check if that works for me next.
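
For anyone reproducing this, a minimal sketch of how to check which line-editing library an interpreter was built against. The helper name `describe_readline_backend` is made up for illustration; the `"libedit" in readline.__doc__` check is the one the readline documentation suggests for telling libedit apart from GNU readline.

```python
import sysconfig


def describe_readline_backend():
    """Report the build-time flag and the line-editing library actually in use."""
    info = {"HAVE_LIBREADLINE": sysconfig.get_config_var("HAVE_LIBREADLINE")}
    try:
        import readline
    except ImportError:
        # No readline module at all (for example on Windows).
        info["backend"] = None
        return info
    doc = readline.__doc__ or ""
    info["backend"] = "libedit" if "libedit" in doc else "GNU readline"
    return info


if __name__ == "__main__":
    print(describe_readline_backend())
```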

ambv avatar Aug 11 '21 10:08 ambv

Hynek confirmed on Big Sur with Python 3.9.5 from asdf that test_input_tty hangs, too, if run for the second time in the same process.

Moreover, readline is not it. First of all, it's libedit on macOS:

❯ ll /usr/lib/libreadline.dylib
lrwxr-xr-x 1 root wheel 15B Feb 2 2020 /usr/lib/libreadline.dylib -> libedit.3.dylib

So Python uses that by default:
>>> import readline
>>> readline._READLINE_LIBRARY_VERSION
'EditLine wrapper'
>>> readline._READLINE_RUNTIME_VERSION
1026
>>> readline._READLINE_VERSION
1026


Unless you instruct it to use readline (for example by providing "-I$(brew --prefix readline)/include" to CFLAGS and "-L$(brew --prefix readline)/lib" to LDFLAGS before running ./configure):
>>> import readline
>>> readline._READLINE_LIBRARY_VERSION
'8.1'
>>> readline._READLINE_RUNTIME_VERSION
2049
>>> readline._READLINE_VERSION
2049

The hang is the same in both cases.

Next course of action, checking if it's not due to fork shenanigans in _run_child():

https://github.com/python/cpython/blob/1841c70f2bdab9d29c1c74a8afffa45d5555af98/Lib/test/test_builtin.py#L2001

ambv avatar Aug 11 '21 10:08 ambv

Parent process hangs on:

  • thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    • frame #0: 0x00007fff6741181e libsystem_kernel.dylib`read + 10
      frame #1: 0x000000010226a117 python.exe`_Py_read(fd=3, buf=0x00007f8d24009840, count=8192) at fileutils.c:1744:13
      frame #2: 0x00000001022f1335 python.exe`_io_FileIO_readinto_impl(self=0x0000000103b284d0, buffer=0x00007ffeedcbe928) at fileio.c:645:9
      frame #3: 0x00000001022f063e python.exe`_io_FileIO_readinto(self=0x0000000103b284d0, arg=0x0000000102f5d090) at fileio.c.h:205:20
      frame #4: 0x00000001020050e9 python.exe`method_vectorcall_O(func=0x00000001026fd970, args=0x00007ffeedcbeaf0, nargsf=2, kwnames=0x0000000000000000) at descrobject.c:462:24
      frame #5: 0x0000000101ff323d python.exe`_PyObject_VectorcallTstate(tstate=0x00007f8d20d04f00, callable=0x00000001026fd970, args=0x00007ffeedcbeaf0, nargsf=2, kwnames=0x0000000000000000) at abstract.h:114:11
      frame #6: 0x0000000101ff30c9 python.exe`PyObject_VectorcallMethod(name=0x00000001026fcbe0, args=0x00007ffeedcbeaf0, nargsf=2, kwnames=0x0000000000000000) at call.c:770:24
      frame #7: 0x00000001022f92a0 python.exe`PyObject_CallMethodOneArg(self=0x0000000103b284d0, name=0x00000001026fcbe0, arg=0x0000000102f5d090) at abstract.h:204:12

where "name" in frame #7 is the "readinto" method of <_io.FileIO name=3 mode='rb' closefd=True> and "arg" is <memory at 0x102f5d090>.

Child process hangs on:

  • thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    • frame #0: 0x00007fff67413bf6 libsystem_kernel.dylib`write + 10
      frame #1: 0x000000010226a3e0 python.exe`_Py_write_impl(fd=2, buf=0x00007ffeedcbcbcb, count=1, gil_held=0) at fileutils.c:1813:17
      frame #2: 0x000000010226a535 python.exe`_Py_write_noraise(fd=2, buf=0x00007ffeedcbcbcb, count=1) at fileutils.c:1871:12
      frame #3: 0x0000000102257834 python.exe`_Py_DumpASCII(fd=2, text=0x0000000102a1bed0) at traceback.c:1002:13
      frame #4: 0x0000000102258ba5 python.exe`dump_frame(fd=2, frame=0x00000001025dbba8) at traceback.c:1035:9
      frame #5: 0x00000001022579fa python.exe`dump_traceback(fd=2, tstate=0x00007f8d20d04f00, write_header=0) at traceback.c:1084:9
      frame #6: 0x0000000102257bc6 python.exe`_Py_DumpTracebackThreads(fd=2, interp=0x00007f8d2281b010, current_tstate=0x00007f8d20d04f00) at traceback.c:1186:9
      frame #7: 0x0000000102311dc3 python.exe`faulthandler_dump_traceback(fd=2, all_threads=1, interp=0x00007f8d2281b010) at faulthandler.c:245:15
      frame #8: 0x000000010231224b python.exe`faulthandler_user(signum=14) at faulthandler.c:843:5
      frame #9: 0x00007fff674c85fd libsystem_platform.dylib`_sigtramp + 29
      frame #10: 0x00007fff6741435f libsystem_kernel.dylib`__ioctl + 11
      frame #11: 0x00007fff6741434b libsystem_kernel.dylib`ioctl + 150
      frame #12: 0x00007fff6734ad63 libsystem_c.dylib`tcsetattr + 111
      frame #13: 0x0000000103c772ee libreadline.8.dylib`_set_tty_settings + 28
      frame #14: 0x0000000103c76d87 libreadline.8.dylib`rl_prep_terminal + 683
      frame #15: 0x0000000103c88ce9 libreadline.8.dylib`_rl_callback_newline + 51
      frame #16: 0x0000000103c5ae75 readline.cpython-311d-darwin.so`readline_until_enter_or_signal(prompt="prompt", signal=0x00007ffeedcbd63c) at readline.c:1318:5
      frame #17: 0x0000000103c58637 readline.cpython-311d-darwin.so`call_readline(sys_stdin=0x00007fff8d9c8d90, sys_stdout=0x00007fff8d9c8e28, prompt="prompt") at readline.c:1396:9
      frame #18: 0x0000000101fad9b6 python.exe`PyOS_Readline(sys_stdin=0x00007fff8d9c8d90, sys_stdout=0x00007fff8d9c8e28, prompt="prompt") at myreadline.c:391:14
      frame #19: 0x00000001021a071d python.exe`builtin_input_impl(module=0x000000010268c0b0, prompt=0x00000001027a0400) at bltinmodule.c:2188:13

ambv avatar Aug 11 '21 11:08 ambv

This might be a long-standing problem. I haven't encountered it before because I was always running -R: with -j and in this case the test is skipped:

test_input_tty (test.test_builtin.PtyTests) ... skipped 'stdin and stdout must be ttys'
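
The skip message points at a tty check on the standard streams, and under -j each worker process presumably gets pipes rather than a terminal. A minimal sketch of that kind of check follows; it is an illustration of the idea, not a copy of the code in test_builtin.py, and the function name is hypothetical.

```python
import sys


def std_streams_are_ttys():
    """Return True only if both stdin and stdout are attached to a terminal."""
    for stream in (sys.stdin, sys.stdout):
        # A stream can be None (e.g. pythonw) or replaced by a non-file object.
        isatty = getattr(stream, "isatty", None)
        if isatty is None or not isatty():
            return False
    return True


if __name__ == "__main__":
    # Compare `python script.py` in a terminal with `python script.py | cat`.
    print(std_streams_are_ttys())
```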

ambv avatar Aug 11 '21 11:08 ambv

Amazingly, excluding every other test function with a bunch of -i patterns still makes it hang when run twice. On the other hand, only including the test function with -m works fine.

This is very weird. Looking further.

Semi-relatedly, I found BPO-26228, could reproduce it, and finished an open PR on it. While those are separate issues, I'm hoping to solve them both.

ambv avatar Aug 11 '21 14:08 ambv

I found the high-level reason why test_builtin hangs: it runs doctests as well. What's the root cause? I don't know yet.

But to confirm, I can also hang the tests by running:

$ python3.9 -m test test_doctest test_builtin -v

Now to discover what it is that doctest does...

ambv avatar Aug 11 '21 15:08 ambv

The doctest runner sets an output-redirecting debugger, which subclasses Pdb, around actually running the doctest. This action causes the hang. New finding: we can hang the test with test_pdb too:

$ python3.9 -m test test_pdb test_builtin -v

ambv avatar Aug 11 '21 18:08 ambv

It *is* readline-related after all O_O

Commenting out this section in Pdb.__init__ makes the issue go away: https://github.com/python/cpython/blob/64a7812c170f5d46ef16a1517afddc7cd92c5240/Lib/pdb.py#L234-L239

time ./python.exe -E -Wd -m test test_builtin test_builtin
0:00:00 load avg: 2.12 Run tests sequentially
0:00:00 load avg: 2.12 [1/2] test_builtin
0:00:00 load avg: 2.12 [2/2] test_builtin

== Tests result: SUCCESS ==

All 2 tests OK.

Total duration: 1.3 sec
Tests result: SUCCESS
        1.56 real         1.42 user         0.10 sys

I'll be continuing on this tomorrow to find the root cause.
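
A rough way to observe the side effect of that Pdb.__init__ section without editing pdb.py. This sketch assumes the linked lines are the block that imports readline and calls readline.set_completer_delims(); the helper name is hypothetical. It sets a known delimiter string, constructs a Pdb instance, and reports whether the process-wide readline state changed.

```python
import pdb


def pdb_changes_completer_delims():
    """Return True if constructing Pdb altered readline's completer delimiters.

    Returns None when the readline module is not available.
    """
    try:
        import readline
    except ImportError:
        return None
    original = readline.get_completer_delims()
    marker = " \t\n"  # a known baseline to compare against
    readline.set_completer_delims(marker)
    try:
        pdb.Pdb(readrc=False)  # readrc=False: do not load any .pdbrc files
        return readline.get_completer_delims() != marker
    finally:
        # Restore the global readline state we touched.
        readline.set_completer_delims(original)


if __name__ == "__main__":
    print(pdb_changes_completer_delims())
```

Because the readline state is global to the process, anything that instantiates Pdb (doctest's debugger, test_pdb) can affect a later test in the same process, which would fit the observations above.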

ambv avatar Aug 11 '21 19:08 ambv

Is it related to https://bugs.python.org/issue41034 ?

vstinner avatar Aug 30 '21 16:08 vstinner

I've looked into this and the hang happens on this line:

https://github.com/python/cpython/blob/de3db1448b1b983eeb9f4498d07e3d2f1fb6d29d/Lib/test/test_builtin.py#L2030

So the issue is that on the second run, there's nothing to read on that fd. I tried using os.stat to check whether there is data on the fd, but it reported a size of 0 on both the first and second runs.

However, if a small sleep is added before calling os.stat, it reports the size of the data on the first run and 0 on the second run, meaning it's possible to avoid the hang and error out instead (is that an improvement?)

This is on macOS 11.4 Big Sur by the way.

This is my test debug branch:

https://github.com/python/cpython/compare/main...akulakov:Test-check_input_tty-FIX?expand=1
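
The "error out instead of hanging" idea can also be sketched with select() rather than os.stat() plus a sleep. This is a different technique from the one in the debug branch, shown only as an illustration; `read_with_timeout` is a hypothetical helper, and select() on pipes and ptys is a POSIX-only assumption.

```python
import os
import select


def read_with_timeout(fd, timeout=1.0, size=8192):
    """Read from fd, raising TimeoutError instead of blocking forever."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        raise TimeoutError(f"no data on fd {fd} after {timeout} seconds")
    return os.read(fd, size)


if __name__ == "__main__":
    r, w = os.pipe()
    try:
        os.write(w, b"quux\n")
        print(read_with_timeout(r))      # data is available, returns at once
        try:
            read_with_timeout(r, 0.1)    # nothing left to read
        except TimeoutError as exc:
            print("timed out:", exc)
    finally:
        os.close(r)
        os.close(w)
```

This would turn the hang into a test failure, but it does not explain why the child never writes its result on the second run.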

akulakov avatar Nov 13 '21 00:11 akulakov

Still seen on Ventura 13.5.2 in Python 3.13 (i.e., the main branch). Would it be possible to get a fix?

gvanrossum avatar Nov 15 '23 03:11 gvanrossum

test_builtin.test_input_tty() calls forkpty(), whereas recent macOS versions are known to have issues with fork().

forkpty() manual page:

The forkpty() function combines openpty(), fork(2), and login_tty() to create a new process operating in a pseudoterminal.

Maybe the test should be rewritten with openpty() and login_tty(), but replace fork() with subprocess.Popen()?

glibc implementation:

int
__forkpty (int *pptmx, char *name, const struct termios *termp,
	   const struct winsize *winp)
{
  int ptmx, terminal, pid;

  if (openpty (&ptmx, &terminal, name, termp, winp) == -1)
    return -1;

  switch (pid = __fork ())
    {
    case -1:
      __close (ptmx);
      __close (terminal);
      return -1;

    case 0:
      /* Child.  */
      __close (ptmx);
      if (login_tty (terminal))
	_exit (1);

      return 0;

    default:
      /* Parent.  */
      *pptmx = ptmx;
      __close (terminal);

      return pid;
    }
}
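
The openpty() plus subprocess.Popen() idea could look roughly like the sketch below. It is not a patch for test_builtin: `run_on_pty` is a hypothetical helper, start_new_session=True only performs the setsid() part of what login_tty() does (the pty does not become the child's controlling terminal), and the readline module is not loaded in the child.

```python
import os
import pty
import select
import subprocess
import sys


def run_on_pty(code, input_bytes, timeout=10.0):
    """Run `code` in a fresh interpreter whose stdio is a pty; return its output."""
    parent_fd, child_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            [sys.executable, "-c", code],
            stdin=child_fd, stdout=child_fd, stderr=child_fd,
            start_new_session=True,  # new session instead of fork() in-process
        )
    except BaseException:
        os.close(parent_fd)
        raise
    finally:
        os.close(child_fd)  # the child keeps its own copy
    chunks = []
    try:
        os.write(parent_fd, input_bytes)
        while True:
            ready, _, _ = select.select([parent_fd], [], [], timeout)
            if not ready:
                proc.kill()  # give up rather than hang
                break
            try:
                chunk = os.read(parent_fd, 1024)
            except OSError:
                break  # Linux reports EIO once the child side is closed
            if not chunk:
                break  # other platforms report EOF instead
            chunks.append(chunk)
    finally:
        os.close(parent_fd)
        proc.wait()
    return b"".join(chunks)


if __name__ == "__main__":
    print(run_on_pty("print(input('prompt').upper())", b"quux\n"))
```

Since the child is a freshly exec'ed interpreter, it does not inherit whatever state the parent test process accumulated, which is the main point of avoiding fork() here. The captured bytes include the terminal echo of the input as well as the child's own output.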

vstinner avatar Nov 15 '23 03:11 vstinner

It happens for me now every time I run ./python.exe -m test test_builtin -v

test_input_no_stdout_fileno (test.test_builtin.PtyTests.test_input_no_stdout_fileno) ... ok
test_input_tty (test.test_builtin.PtyTests.test_input_tty) ... ^C

== Tests result: INTERRUPTED ==

1 test omitted:
    test_builtin

Test suite interrupted by signal SIGINT.

Total duration: 37.5 sec
Total tests: run=0
Total test files: run=0/1
Result: INTERRUPTED

Even ./python.exe -m test test_builtin -v -m test_input_tty hangs:

» ./python.exe -m test test_builtin -v -m test_input_tty
== CPython 3.14.0a0 (heads/readydefault:4cb2d40830f, May 10 2024, 15:38:37) [Clang 15.0.0 (clang-1500.3.9.4)]
== macOS-14.4.1-arm64-arm-64bit-Mach-O little-endian
== Python build: debug
== cwd: /Users/sobolev/Desktop/cpython2/build/test_python_worker_94096æ
== CPU count: 12
== encodings: locale=UTF-8 FS=utf-8
== resources: all test resources are disabled, use -u option to unskip tests

Using random seed: 3295473582
0:00:00 load avg: 1.55 Run 1 test sequentially
0:00:00 load avg: 1.55 [1/1] test_builtin
test_input_tty (test.test_builtin.PtyTests.test_input_tty) ... 

😢

Is it related to a new REPL?

sobolevn avatar May 10 '24 13:05 sobolevn