cpython icon indicating copy to clipboard operation
cpython copied to clipboard

bpo-32608: Adding a server into socketserver that handles client connection in new "multiprocessing.Process" processes.

Open deliberist opened this issue 7 years ago • 11 comments

https://bugs.python.org/issue32608

deliberist avatar Jan 21 '18 13:01 deliberist

Hello, and thanks for your contribution!

I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).

Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.

Thanks again to your contribution and we look forward to looking at it!

the-knights-who-say-ni avatar Jan 21 '18 13:01 the-knights-who-say-ni

I signed the CLA yesterday, does it take a few days for that to be seen?

deliberist avatar Jan 21 '18 13:01 deliberist

For my own sanity, I'm dropping some useful commands I used when building/testing these changes.

Linux builds:

cd cpython
./configure --with-pydebug
make -j
./python -m test.pythoninfo
./python -m test
./python -m test -v test.test_socketserver -uall

Windows builds:

cd cpython
.\PCbuild\clean.bat
.\PCbuild\build.bat -e
.\PCbuild\win32\python.exe -m test.pythoninfo
.\PCbuild\win32\python.exe -m test
.\PCbuild\win32\python.exe -m test -v test.test_socketserver -uall
.\PCbuild\rt.bat -q -uall -u-cpu -rwW --slowest --timeout=1200 --fail-env-changed -j0

deliberist avatar Feb 10 '18 18:02 deliberist

Can someone with more Python on Windows experience be able to explain to me why the AppVeyor build is failing?

To me, it looks like the AppVeyor is failing on the test_threading module (which I did not modify) and the test_socketserver (which I did modify). However, I do not understand the circumstances where the test is failing on Windows but not on my LinuxMint 18.1 dev machine.

I would like this pull-request to get merged as soon as we can, but I doubt it will make it much farther with one of the build checks failing.

-- Thanks

deliberist avatar Feb 10 '18 19:02 deliberist

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

bedevere-bot avatar May 01 '18 21:05 bedevere-bot

Hi Antoine (pitrou). I've made the changes you suggested. However the CI builds fail on Windows because of dangling threads that I cannot seem to track down why. Do you have a few minutes have a second set of eyes looking at the errors?

Locally, I print a traceback in threading.py when a new thread is added to the threading._dangling WeakSet. This tells me where the code is when it creates the threads that are still dangling at the end of the test_socketserver tests.

I added this into threading.py, in Thread.__init__():

        import sys
        import traceback
        print()
        print('\tNAME = {}'.format(self._name))
        traceback.print_stack(file=sys.stdout)

This bug is close to being complete, I cannot figure out why the Process objects are creating these extra threads that are not being cleaned up.

-- Thanks!

deliberist avatar May 06 '18 21:05 deliberist

I have looked at the code over and over again. I keep coming to the conclusion that the reason the AppVeyor tests are failing only on Windows is because there is a problem with the way Windows handles multiple processes. For some reason, on Windows there is a dangling thread laying around after the unit tests run. I just cannot find exactly where and why a thread gets created but never cleaned up in the Thread._dangling object.

This scenario does not happen on Linux systems, which leads me to believe this problem is a bit out of scope of this ticket.

Can someone look and confirm? I'd really like to close this ticket so it can be merged into the main baseline, however the failing tests on Windows seems to be preventing this ticket from advancing.

deliberist avatar Jun 07 '18 21:06 deliberist

AppVeyor spotted a dangling thread:

0:03:19 [286/416/2] test_socketserver failed (env changed)
test_forking_handled (test.test_socketserver.ErrorHandlerTest) ... skipped 'requires forking'
test_forking_not_handled (test.test_socketserver.ErrorHandlerTest) ... skipped 'requires forking'
test_sync_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_sync_not_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_threading_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_threading_not_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_all (test.test_socketserver.MiscTestCase) ... ok
test_shutdown_request_called_if_verify_request_false (test.test_socketserver.MiscTestCase) ... ok
test_ForkingTCPServer (test.test_socketserver.SocketServerTest) ... skipped 'requires forking'
test_ForkingUDPServer (test.test_socketserver.SocketServerTest) ... skipped 'requires forking'
test_ForkingUnixDatagramServer (test.test_socketserver.SocketServerTest) ... skipped 'requires Unix sockets'
test_ForkingUnixStreamServer (test.test_socketserver.SocketServerTest) ... skipped 'requires Unix sockets'
test_ProcessingTCPServer (test.test_socketserver.SocketServerTest) ... creating server
ADDR = ('127.0.0.1', 4730)
CLASS = <class 'socketserver.ProcessingTCPServer'>
server running
test client 0
test client 1
test client 2
waiting for server
done
Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Dangling thread: <Thread(Thread-3, started daemon 2932)>
Dangling thread: <_MainThread(MainThread, started 3804)>
ok
(...)
Warning -- threading._dangling was modified by test_socketserver
  Before: <_weakrefset.WeakSet object at 0x03330ED0>
  After:  <_weakrefset.WeakSet object at 0x03330E50>

1 test altered the execution environment:
    test_socketserver

vstinner avatar Jun 08 '18 08:06 vstinner

@vstinner, Yup I see that there is a dangling thread. I cannot see what is different when running on Windows that produces the dangling thread that does not happen on Linux.

deliberist avatar Jun 08 '18 09:06 deliberist

This PR is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Aug 14 '22 00:08 github-actions[bot]

FWIW, this could still be merged in, I just never had the time or experience to figure out why the Windows CI tests failed. I've tested the branch on Windows directly and it passed. I do not understand why the CI was failing, but directly on a Windows host passed.

deliberist avatar Aug 14 '22 14:08 deliberist

This PR is stale because it has been open for 30 days with no activity.

github-actions[bot] avatar Feb 27 '23 00:02 github-actions[bot]