ipykernel icon indicating copy to clipboard operation
ipykernel copied to clipboard

ipykernel can start debugpy on an occupied port

Open roblourens opened this issue 3 years ago • 1 comments

I am troubleshooting some flaky tests for our jupyter notebook debugger in vscode. I saw an instance where we tried to start the debugger and never got a response. I see this message in the log:

2022-08-31T09:09:49.4948164Z info 9:9:49.487:  Debug IO Pub message: {"header":{"msg_id":"7c8c5c89-6f267f41f37c03b50f0f97d9_5315_29","msg_type":"error","username":"runner","session":"7c8c5c89-6f267f41f37c03b50f0f97d9","date":"2022-08-31T09:09:49.481276Z","version":"5.3"},"msg_id":"7c8c5c89-6f267f41f37c03b50f0f97d9_5315_29","msg_type":"error","parent_header":{"msg_id":"7c8c5c89-6f267f41f37c03b50f0f97d9_5315_27","msg_type":"execute_request","username":"runner","session":"7c8c5c89-6f267f41f37c03b50f0f97d9","date":"2022-08-31T09:09:48.672109Z","version":"5.3"},"metadata":{},"content":{"traceback":["\u001b[0;31m---------------------------------------------------------------------------\u001b[0m","\u001b[0;31mRuntimeError\u001b[0m                              Traceback (most recent call last)","Input \u001b[0;32mIn [1]\u001b[0m, in \u001b[0;36m<cell line: 1>\u001b[0;34m()\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[38;5;28;01mimport\u001b[39;00m \u001b[38;5;21;01mdebugpy\u001b[39;00m;\u001b[43mdebugpy\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlisten\u001b[49m\u001b[43m(\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43m127.0.0.1\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m42323\u001b[39;49m\u001b[43m)\u001b[49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/public_api.py:31\u001b[0m, in \u001b[0;36m_api.<locals>.apply.<locals>.wrapper\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m     28\u001b[0m \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01mdebugpy\u001b[39;00m\u001b[38;5;21;01m.\u001b[39;00m\u001b[38;5;21;01mserver\u001b[39;00m \u001b[38;5;28;01mimport\u001b[39;00m api\n\u001b[1;32m     30\u001b[0m wrapped \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mgetattr\u001b[39m(api, f\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m)\n\u001b[0;32m---> 31\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mwrapped\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43margs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/server/api.py:140\u001b[0m, in \u001b[0;36m_starts_debugging.<locals>.debug\u001b[0;34m(address, **kwargs)\u001b[0m\n\u001b[1;32m    138\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m func(address, settrace_kwargs, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs)\n\u001b[1;32m    139\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mException\u001b[39;00m:\n\u001b[0;32m--> 140\u001b[0m     \u001b[43mlog\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mreraise_exception\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;132;43;01m{0}\u001b[39;49;00m\u001b[38;5;124;43m() failed:\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mfunc\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[38;5;18;43m__name__\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mlevel\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43minfo\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m)\u001b[49m\n","File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/server/api.py:138\u001b[0m, in \u001b[0;36m_starts_debugging.<locals>.debug\u001b[0;34m(address, **kwargs)\u001b[0m\n\u001b[1;32m    135\u001b[0m     settrace_kwargs[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdont_trace_end_patterns\u001b[39m\u001b[38;5;124m\"\u001b[39m] \u001b[38;5;241m=\u001b[39m (\u001b[38;5;28mstr\u001b[39m(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mdebugpy_launcher.py\u001b[39m\u001b[38;5;124m\"\u001b[39m),)\n\u001b[1;32m    137\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m--> 138\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[43maddress\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43msettrace_kwargs\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m    139\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mException\u001b[39;00m:\n\u001b[1;32m    140\u001b[0m     log\u001b[38;5;241m.\u001b[39mreraise_exception(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;132;01m{0}\u001b[39;00m\u001b[38;5;124m() failed:\u001b[39m\u001b[38;5;124m\"\u001b[39m, func\u001b[38;5;241m.\u001b[39m\u001b[38;5;18m__name__\u001b[39m, level\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124minfo\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n","File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/server/api.py:240\u001b[0m, in \u001b[0;36mlisten\u001b[0;34m(address, settrace_kwargs)\u001b[0m\n\u001b[1;32m    237\u001b[0m log\u001b[38;5;241m.\u001b[39minfo(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mEndpoints received from adapter: \u001b[39m\u001b[38;5;132;01m{0}\u001b[39;00m\u001b[38;5;124m\"\u001b[39m, json\u001b[38;5;241m.\u001b[39mrepr(endpoints))\n\u001b[1;32m    239\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124merror\u001b[39m\u001b[38;5;124m\"\u001b[39m \u001b[38;5;129;01min\u001b[39;00m endpoints:\n\u001b[0;32m--> 240\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mRuntimeError\u001b[39;00m(\u001b[38;5;28mstr\u001b[39m(endpoints[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124merror\u001b[39m\u001b[38;5;124m\"\u001b[39m]))\n\u001b[1;32m    242\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[1;32m    243\u001b[0m     server_host \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mstr\u001b[39m(endpoints[\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mserver\u001b[39m\u001b[38;5;124m\"\u001b[39m][\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mhost\u001b[39m\u001b[38;5;124m\"\u001b[39m])\n","\u001b[0;31mRuntimeError\u001b[0m: Can't listen for client connections: [Errno 98] Address already in use"],"ename":"RuntimeError","evalue":"Can't listen for client connections: [Errno 98] Address already in use"},"buffers":[],"channel":"iopub"}
2022-08-31T09:09:49.4981656Z info 9:9:49.489:  [32mWidget Coordinator received IPyWidgets_msg_received

If I clean the error message up a bit...

Traceback (most recent call last)
Input In [1], in <cell line: 1>()
----> 1 import debugpy;debugpy.listen((\"127.0.0.1\",42323))
","File /opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/public_api.py:31, in _api.<locals>.apply.<locals>.wrapper(*args, **kwargs)
     28 from debugpy.server import api
     30 wrapped = getattr(api, f.__name__)
---> 31 return wrapped(*args, **kwargs)
","File /opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/server/api.py:140, in _starts_debugging.<locals>.debug(address, **kwargs)
    138     return func(address, settrace_kwargs, **kwargs)
    139 except Exception:
--> 140     log.reraise_exception(\"{0}() failed:\", func.__name__, level=\"info\")
","File /opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/server/api.py:138, in _starts_debugging.<locals>.debug(address, **kwargs)
    135     settrace_kwargs[\"dont_trace_end_patterns\"] = (str(\"debugpy_launcher.py\"),)
    137 try:
--> 138     return func(address, settrace_kwargs, **kwargs)
    139 except Exception:
    140     log.reraise_exception(\"{0}() failed:\", func.__name__, level=\"info\")
","File /opt/hostedtoolcache/Python/3.10.6/x64/lib/python3.10/site-packages/debugpy/server/api.py:240, in listen(address, settrace_kwargs)
    237 log.info(\"Endpoints received from adapter: {0}\", json.repr(endpoints))
    239 if \"error\" in endpoints:
--> 240     raise RuntimeError(str(endpoints[\"error\"]))
    242 try:
    243     server_host = str(endpoints[\"server\"][\"host\"])
","RuntimeError: Can't listen for client connections: [Errno 98] Address already in use

I see that ipykernel picks a free port here, and sends it to debugpy https://github.com/ipython/ipykernel/blob/cce9b7d736d2cdd32e8695436dbed77bc7912080/ipykernel/debugger.py#L231-L239

and in the error message, debugpy is saying that the port is not free. I know it's not really possible to do this kind of thing in an atomic way. Is it possible for ipykernel to detect this error and retry with a new port? Should debugpy have a mode where it can pick a free port and return it? Or should a client like vscode detect this error and retry?

Our test run: https://github.com/microsoft/vscode-jupyter/runs/8109923123?check_suite_focus=true

roblourens avatar Sep 10 '22 01:09 roblourens

Not sure if it helps, but here is a link to how jupyter-server finds a port with retries; it not only checks for address being already in use but also for permission errors.

krassowski avatar Sep 12 '22 23:09 krassowski