pyzmq 26, Windows 11: Bad file descriptor

Open malgarop opened this issue 10 months ago • 16 comments

This is a pyzmq bug

  • [X] This is a pyzmq-specific bug, not an issue of zmq socket behavior. Don't worry if you're not sure! We'll figure it out together.

What pyzmq version?

26.0.0

What libzmq version?

none

Python version (and how it was installed)

3.12.3

OS

windows 11

What happened?

After installing jupyter on a new Windows machine, it fails to connect to any python3 kernel and shows the following error popup: "Error Starting Kernel" / "NetworkError when attempting to fetch resource."

To reproduce, in PowerShell on a Windows 11 machine with Python 3.12.3:

  1. pip install jupyter
  2. python -m jupyter notebook (or jupyter-lab)
  3. Start a new notebook, or open an existing one, with the python3 kernel.
  4. The process fails and the error pops up. The shell shows the following output:

"Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmp9czwcdbh\build_deps\bundled_libzmq-src\src\epoll.cpp:73)"

It works after I uninstall pyzmq and reinstall the Dec 5, 2023 version:

'pip freeze | findstr pyzmq'
output: "pyzmq==26.0.0"
'pip uninstall pyzmq'
'pip install pyzmq==25.1.2'

Code to reproduce bug

pip install jupyter
python -m jupyter notebook  # or: jupyter-lab

Traceback, if applicable

Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmp9czwcdbh\build_deps\bundled_libzmq-src\src\epoll.cpp:73)

More info

No response

malgarop avatar Apr 19 '24 16:04 malgarop

Can I ask how you installed pyzmq and Python? Do you have any other Python versions available to test?

Can you run a basic script like:

import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")

minrk avatar Apr 19 '24 17:04 minrk
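
For anyone running the bind test above: as far as I understand, a libzmq assertion failure like the one in this issue terminates the whole interpreter instead of raising a Python exception, so a try/except around the script would not catch it. A minimal sketch of one way to capture the exit code and the error text is to run the test in a child process. The helper name `run_probe` and the 30 second timeout are illustrative assumptions, not part of pyzmq.

```python
import subprocess
import sys

# The bind test from the comment above, as source text for a child interpreter.
PROBE = """
import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")
print("ok")
"""


def run_probe(code: str = PROBE, timeout: float = 30.0) -> tuple[int, str, str]:
    """Run code in a fresh interpreter; return (returncode, stdout, stderr).

    run_probe is a hypothetical helper for this thread, not a pyzmq API.
    """
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        errors="replace",  # keep going if the child prints undecodable bytes
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr


if __name__ == "__main__":
    returncode, out, err = run_probe()
    print("exit code:", returncode)
    print("stdout:", out.strip())
    print("stderr:", err.strip())
```

A nonzero exit code together with the "Bad file descriptor (...epoll.cpp:73)" line on stderr would match the reports in this issue; an exit code of 0 and "ok" on stdout means the bind succeeded.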

same issue here. FYI @minrk, I've just run your code to confirm.

import zmq
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")
Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmpuk6z2au_\build\_deps\bundled_libzmq-src\src\epoll.cpp:73)

Also using Python 3.12.3 from the Microsoft Store (Windows 11).

grvstick avatar Apr 22 '24 08:04 grvstick

@minrk thanks for your patience, running the scripts you suggested results in: Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmps23b75cz\build\_deps\bundled_libzmq-src\src\epoll.cpp:73)

I installed Python via the Microsoft Store on Windows 11. pip installed pyzmq for me as a dependency of jupyter.

malgarop avatar Apr 22 '24 13:04 malgarop

Interesting that you both got it via the Store. Do you have access to another source, e.g. a miniconda install or Python.org installer? Just to test if it's related to the Python installation instead of the environment?

minrk avatar Apr 23 '24 07:04 minrk

@minrk just confirmed that it does not happen with the python.org release of 3.12.3, both with Jupyter Lab and with the test code you provided. I couldn't test conda, since Jupyter only seems to be able to run there with conda packages, not pip, and I wanted to test with pyzmq >= 26.0.0.

grvstick avatar Apr 25 '24 01:04 grvstick

For what it's worth, I get the same error using Python 3.11.9. I don't recall how I installed it.

jochoaAvant avatar May 06 '24 13:05 jochoaAvant

Same here! I had the same error, and I fixed it by downgrading pyzmq to version 25.1.2 (I was on 26.0.3). Open Windows PowerShell and uninstall the current version with this command:

python -m pip uninstall pyzmq

Then install the version I mentioned:

pip install pyzmq==25.1.2

Hope it helps!

r23-dias avatar May 07 '24 10:05 r23-dias
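
Before downgrading, it can help to confirm that the installed pyzmq actually falls in the range this thread reports as affected: pyzmq >=26,<26.2, with 25.1.2 reported as working and 26.2 reported later in the thread as carrying a fix. The sketch below uses only the standard library. The function names are made up for illustration, and the version bounds are taken from comments in this thread, not from any official compatibility table.

```python
import re

# Bounds taken from this thread (an assumption, not an official statement):
# reports start at 26.0.0, and 26.2 is described as containing the fix.
AFFECTED_MIN = (26, 0)
FIXED_IN = (26, 2)


def release_tuple(version: str) -> tuple[int, int]:
    """Return (major, minor) from strings like '26.0.3' or '26.2.1.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def in_affected_range(version: str) -> bool:
    """True if version is >=26.0 and <26.2, the range reported in this thread."""
    return AFFECTED_MIN <= release_tuple(version) < FIXED_IN


if __name__ == "__main__":
    from importlib.metadata import PackageNotFoundError, version

    try:
        installed = version("pyzmq")
    except PackageNotFoundError:
        print("pyzmq is not installed in this environment")
    else:
        status = "in" if in_affected_range(installed) else "outside"
        print(f"pyzmq {installed} is {status} the range reported as affected")
```

Being inside the range does not mean the error will occur; several people in this thread run those versions without problems.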

  • same

EliAxcel avatar May 13 '24 18:05 EliAxcel

Same here, my configuration is the same as the opener of the issue

fnicolaim avatar May 15 '24 13:05 fnicolaim

Still getting this error with 26.0.3

bb-at-ss avatar May 30 '24 17:05 bb-at-ss

I'd like to point out this is not the first time this exact issue happens: https://github.com/microsoft/vscode-jupyter/issues/8630 A test may be due.

AgentMC avatar Jun 04 '24 13:06 AgentMC

That vscode issue is not at all related, as far as I can tell. Wrong link?

If you know how to write a test for this, I would love it. I test every situation I can find, but I've never been able to reproduce any of these issues myself, which makes testing very hard.

minrk avatar Jun 04 '24 14:06 minrk

@minrk Well, it's up to you to judge, but the optics are the same: Windows, MS Store installation of Python, latest pyzmq installed automatically, and [image]. Except in my case it said that the bad file descriptor was stderr when trying to launch an ipynb. That issue, 8630, is how I found this repo and this issue.

Sure, I may be missing details, but regardless, 25.1.2 solves it for me, just like a downgrade solved it for them in that entry. ¯\_(ツ)_/¯

AgentMC avatar Jun 04 '24 15:06 AgentMC

It is in the same general category of "a libzmq build is not compatible with a specific Windows situation" which can have many different causes, but often looks the same. Neither the affected Windows/Python nor the affected libzmq are the same, so no test could have caught both failures, and it's frustrating for me because it has nothing to do with pyzmq, it's all in libzmq.

Fortunately, it looks like this one might be feasible to reproduce, unlike the previous round, and the pyzmq 26 build system has more control over how libzmq is built, so I have a little more hope that someone can figure this one out.

Thanks everyone for reporting!

minrk avatar Jun 04 '24 17:06 minrk

Thanks man, u saved me after 2 days of hitting my head on a brick wall

Alessandrob99 avatar Jun 06 '24 11:06 Alessandrob99

Also encountered this issue on my new computer, but a few days ago I was able to start Jupyter using pyzmq==26.0.3 on my old computer

Morsiusiurandum avatar Jul 08 '24 14:07 Morsiusiurandum

As you may already know, this may be due to the character encoding of the username. This issue seems to be common in countries like China and Japan. https://programmerah.com/bad-file-descriptorccizeromq-1602704446950worksrcepoll-cpp100-41232/ https://qiita.com/nijigen_plot/items/6d97f906af8940e22693

P.S.: This problem occurs even if you install everything together using anaconda. I think this has a negative effect not only on Jupyter but also on the startup of Spyder.

tinasiti avatar Aug 06 '24 11:08 tinasiti

I don't know. I am in the US. Normal western alphabet, normal western name, etc. I still experience the issue and have to downgrade every time I create a new venv.

jochoaAvant avatar Aug 06 '24 14:08 jochoaAvant

I've had this problem. I think it may have been caused by having two Python versions installed, one of which did not have Jupyter's libraries installed correctly. To fix it, I uninstalled all Python versions and reinstalled Python and Jupyter, and that was it. Try it yourself.

WILLIAMSV10 avatar Aug 07 '24 22:08 WILLIAMSV10

If anyone who can reproduce this could test the relevant wheel from this build (scroll down to "Artifacts" and pick the Windows zip that matches your arch):

and report back if you get the same errors, different ones, or if the errors go away, that would be a huge help. I've tried, but continue to fail to find any way to get access to a Windows Store-installed Python to test this myself.

minrk avatar Aug 16 '24 12:08 minrk

Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmppy2n81h4\build_deps\bundled_libzmq-src\src\epoll.cpp:73) I didn't change the code, the system, or the configuration. I just restarted the python script and it appeared. And neither upgrading nor uninstalling pyzmq can solve it.

wlfzsd avatar Aug 20 '24 10:08 wlfzsd

Lowering the version to 25.1.2 finally solved it.

Bad file descriptor (C:\Users\runneradmin\AppData\Local\Temp\tmppy2n81h4\build_deps\bundled_libzmq-src\src\epoll.cpp:73) I didn't change the code, the system, or the configuration. I just restarted the python script and it appeared. And neither upgrading nor uninstalling pyzmq can solve it.

wlfzsd avatar Aug 20 '24 10:08 wlfzsd

@wlfzsd thanks for reporting! Can you please share more information about:

  1. your Windows system (e.g. the output of python3 -m platform and python -c "import sys; print(sys.platform)")
  2. how you installed Python (Windows store, conda, Python.org, etc.)
  3. any firewalls, VPNs, or antivirus tools you might be using? It seems that network state can cause failures like this, if this is still #1505. That would make sense, because pyzmq 26 re-enabled epoll/IPC, which was ultimately the source of #1505, the first time this came up.

If anyone who sees this error can test with:

  • pyzmq 26.1.0
  • the appropriate wheel from this PR

and let me know if either or both fixes the problem, that would be a huge help.

minrk avatar Aug 20 '24 13:08 minrk

I was finally able to create an amd64 VM with Windows 11 and the MS Store. Unfortunately, when I install Python from the store and install pyzmq (both 26.0.0 and 26.1.0), everything is fine. So while that may be relevant in some way for some users, it is not sufficient to reproduce the error. So I continue to be at a complete loss for how to verify if this can ever be fixed.

minrk avatar Aug 20 '24 13:08 minrk

I downgraded pyzmq to version 25 and it worked fine.

Jofrejaime avatar Aug 21 '24 17:08 Jofrejaime

I finally managed to trigger this! On one of my VMs, when I create a user with a non-ASCII username (日本語), the error is produced. The same Python and the same pyzmq with an ASCII username have no issue. So somewhere wepoll and/or libzmq and/or Windows itself is doing something really weird.

That means that in at least one case, I can confirm #2024 fixes this (the same fix as earlier)

I believe from above reports that this is not the only thing that causes epoll to fail on Windows, but it is enough, and unavoidable for affected users, that I think disabling epoll until libzmq can come up with a fix is the right move.

minrk avatar Aug 22 '24 08:08 minrk
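
Since a non-ASCII username is one confirmed trigger, a quick way to see whether an account matches that case is to look for non-ASCII characters in the username and home directory. This is only a sketch: the helper names are invented for illustration, it reads the `USERNAME`/`USER` environment variables as an assumption about where the name is available, and, as noted above, an all-ASCII result does not rule out the bug.

```python
import os


def non_ascii_chars(text: str) -> list[str]:
    """Return the distinct non-ASCII characters in text, in first-seen order."""
    seen: list[str] = []
    for ch in text:
        if not ch.isascii() and ch not in seen:
            seen.append(ch)
    return seen


def report() -> dict[str, list[str]]:
    """Check the current username and home directory for non-ASCII characters."""
    # USERNAME is set on Windows, USER on most other systems (assumption).
    username = os.environ.get("USERNAME") or os.environ.get("USER") or ""
    return {
        "username": non_ascii_chars(username),
        "home": non_ascii_chars(os.path.expanduser("~")),
    }


if __name__ == "__main__":
    for name, chars in report().items():
        status = f"non-ASCII characters {chars!r}" if chars else "ASCII only"
        print(f"{name}: {status}")
```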

This should be fixed by pyzmq 26.2. If anyone can still reproduce this with 26.2, let me know as much information as you can about your system. And if you saw this bug but 26.2 fixes it for you, that would be great to hear, as well!

minrk avatar Aug 22 '24 09:08 minrk

@wlfzsd thanks for reporting! Can you please share more information about:

  1. your Windows system (e.g. the output of python3 -m platform and python -c "import sys; print(sys.platform)")
  2. how you installed Python (Windows store, conda, Python.org, etc.)
  3. any firewalls, VPNs, or antivirus tools you might be using? It seems that network state can cause failures like this, if this is still "Windows Python wheels: Bad address src\epoll.cpp:100" (#1505). That would make sense, because pyzmq 26 re-enabled epoll/IPC, which was ultimately the source of #1505, the first time this came up.

If anyone who sees this error can test with:

  • pyzmq 26.1.0
  • the appropriate wheel from this PR

and let me know if either or both fixes the problem, that would be a huge help.

  1. Windows-10-10.0.20348-SP0
  2. Python.org
  3. python -c "import sys; print(sys.platform)" prints win32
  4. The server is running without any firewalls or VPN-related network tools. The sender is using Python 3.11, and the receiver is using 3.12 (both are experiencing the same issue). Both the script path and script name are in English and have not been modified. Initially, the script ran normally for several days and never encountered any exceptions. Before the exception occurred, the script was communicating normally; the error only appeared after restarting the script.

wlfzsd avatar Aug 22 '24 13:08 wlfzsd

If anyone can reproduce this error with pyzmq>=26,<26.2, it would be a huge help if you could test one of the wheels from this build:

  1. scroll down to "Artifacts"
  2. download and extract the zip matching your architecture (probably wheels-win_amd64)
  3. install with pip install --force-reinstall path\to\wheels\pyzmq-...whl for your Python (probably pyzmq-26.2.1.dev0-cp312-cp312-win_amd64.whl for most)
  4. run this script:
import zmq
print(zmq.__version__)
ctx = zmq.Context()
with ctx:
    with ctx.socket(zmq.PUSH) as s:
        s.bind("tcp://127.0.0.1:5555")
print("ok")

and share the result. Please also share the values for:

  1. how was Python installed? (store, Python.org, etc.)
  2. sys.executable
  3. sys.version

minrk avatar Aug 26 '24 13:08 minrk
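
To make the requested values easy to paste into a reply, something like the following could gather them in one go. It is a sketch, not an official pyzmq tool: `collect_info` is a made-up name, and it cannot tell how Python was installed, so that still needs to be answered by hand.

```python
import platform
import sys


def collect_info() -> dict[str, str]:
    """Gather the interpreter and platform details asked for in this thread."""
    info = {
        "platform": platform.platform(),
        "sys.platform": sys.platform,
        "sys.executable": sys.executable,
        "sys.version": sys.version,
    }
    try:
        import zmq
    except ImportError as exc:
        # Report the failure instead of crashing, so the rest is still printed.
        info["pyzmq"] = f"not importable: {exc}"
    else:
        info["pyzmq"] = zmq.__version__
        info["libzmq"] = zmq.zmq_version()
    return info


if __name__ == "__main__":
    for key, value in collect_info().items():
        print(f"{key}: {value}")
```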