cpython
cpython copied to clipboard
bpo-32608: Adding a server into socketserver that handles client connection in new "multiprocessing.Process" processes.
https://bugs.python.org/issue32608
Hello, and thanks for your contribution!
I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA).
Unfortunately our records indicate you have not signed the CLA. For legal reasons we need you to sign this before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue.
Thanks again to your contribution and we look forward to looking at it!
I signed the CLA yesterday, does it take a few days for that to be seen?
For my own sanity, I'm dropping some useful commands I used when building/testing these changes.
Linux builds:
cd cpython
./configure --with-pydebug
make -j
./python -m test.pythoninfo
./python -m test
./python -m test -v test.test_socketserver -uall
Windows builds:
cd cpython
.\PCbuild\clean.bat
.\PCbuild\build.bat -e
.\PCbuild\win32\python.exe -m test.pythoninfo
.\PCbuild\win32\python.exe -m test
.\PCbuild\win32\python.exe -m test -v test.test_socketserver -uall
.\PCbuild\rt.bat -q -uall -u-cpu -rwW --slowest --timeout=1200 --fail-env-changed -j0
Can someone with more Python on Windows experience be able to explain to me why the AppVeyor build is failing?
To me, it looks like the AppVeyor is failing on the test_threading
module (which I did not modify) and the test_socketserver
(which I did modify). However, I do not understand the circumstances where the test is failing on Windows but not on my LinuxMint 18.1 dev machine.
I would like this pull-request to get merged as soon as we can, but I doubt it will make it much farther with one of the build checks failing.
-- Thanks
A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.
Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again
. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.
Hi Antoine (pitrou). I've made the changes you suggested. However the CI builds fail on Windows because of dangling threads that I cannot seem to track down why. Do you have a few minutes have a second set of eyes looking at the errors?
Locally, I print a traceback in threading.py
when a new thread is added to the threading._dangling
WeakSet. This tells me where the code is when it creates the threads that are still dangling at the end of the test_socketserver
tests.
I added this into threading.py
, in Thread.__init__()
:
import sys
import traceback
print()
print('\tNAME = {}'.format(self._name))
traceback.print_stack(file=sys.stdout)
This bug is close to being complete, I cannot figure out why the Process objects are creating these extra threads that are not being cleaned up.
-- Thanks!
I have looked at the code over and over again. I keep coming to the conclusion that the reason the AppVeyor tests are failing only on Windows is because there is a problem with the way Windows handles multiple processes. For some reason, on Windows there is a dangling thread laying around after the unit tests run. I just cannot find exactly where and why a thread gets created but never cleaned up in the Thread._dangling
object.
This scenario does not happen on Linux systems, which leads me to believe this problem is a bit out of scope of this ticket.
Can someone look and confirm? I'd really like to close this ticket so it can be merged into the main baseline, however the failing tests on Windows seems to be preventing this ticket from advancing.
AppVeyor spotted a dangling thread:
0:03:19 [286/416/2] test_socketserver failed (env changed)
test_forking_handled (test.test_socketserver.ErrorHandlerTest) ... skipped 'requires forking'
test_forking_not_handled (test.test_socketserver.ErrorHandlerTest) ... skipped 'requires forking'
test_sync_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_sync_not_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_threading_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_threading_not_handled (test.test_socketserver.ErrorHandlerTest) ... ok
test_all (test.test_socketserver.MiscTestCase) ... ok
test_shutdown_request_called_if_verify_request_false (test.test_socketserver.MiscTestCase) ... ok
test_ForkingTCPServer (test.test_socketserver.SocketServerTest) ... skipped 'requires forking'
test_ForkingUDPServer (test.test_socketserver.SocketServerTest) ... skipped 'requires forking'
test_ForkingUnixDatagramServer (test.test_socketserver.SocketServerTest) ... skipped 'requires Unix sockets'
test_ForkingUnixStreamServer (test.test_socketserver.SocketServerTest) ... skipped 'requires Unix sockets'
test_ProcessingTCPServer (test.test_socketserver.SocketServerTest) ... creating server
ADDR = ('127.0.0.1', 4730)
CLASS = <class 'socketserver.ProcessingTCPServer'>
server running
test client 0
test client 1
test client 2
waiting for server
done
Warning -- threading_cleanup() failed to cleanup 1 threads (count: 1, dangling: 2)
Dangling thread: <Thread(Thread-3, started daemon 2932)>
Dangling thread: <_MainThread(MainThread, started 3804)>
ok
(...)
Warning -- threading._dangling was modified by test_socketserver
Before: <_weakrefset.WeakSet object at 0x03330ED0>
After: <_weakrefset.WeakSet object at 0x03330E50>
1 test altered the execution environment:
test_socketserver
@vstinner, Yup I see that there is a dangling thread. I cannot see what is different when running on Windows that produces the dangling thread that does not happen on Linux.
This PR is stale because it has been open for 30 days with no activity.
FWIW, this could still be merged in, I just never had the time or experience to figure out why the Windows CI tests failed. I've tested the branch on Windows directly and it passed. I do not understand why the CI was failing, but directly on a Windows host passed.
This PR is stale because it has been open for 30 days with no activity.