Exception ignored message when creating failed + valid session
Hi!
During work on one of my PRs we ran into some exception ignored messages such as:
Exception ignored in: <function Connection.__del__ at 0x7af078ff67a0>
Traceback (most recent call last):
File "/home/qubix/.pyenv/versions/3.11.6/lib/python3.11/site-packages/irods/connection.py", line 132, in __del__
self.disconnect()
File "/home/qubix/.pyenv/versions/3.11.6/lib/python3.11/site-packages/irods/connection.py", line 361, in disconnect
self.send(disconnect_msg)
File "/home/qubix/.pyenv/versions/3.11.6/lib/python3.11/site-packages/irods/connection.py", line 148, in send
raise NetworkException("Unable to send message")
irods.exception.NetworkException: Unable to send message
Even though the functionality in iBridges still works fine, this is of course somewhat confusing to the user/us. We first thought it was caused by the new PR, but it seems to originate from the PRC, since installing an old version of PRC removes the message. For reproducing I have found this works relatively consistently:
import sys
from irods.session import iRODSSession
if __name__ == "__main__":
env_file = sys.argv[1]
try:
session = iRODSSession(password="bad_pw", irods_env_file=env_file)
print(session.server_version)
except Exception:
pass
session = iRODSSession(password="good_pw", irods_env_file=env_file)
print(session.server_version)
If you can't reproduce it the first time, it might reproduce the second time in my experience.
Looking through the PRC release notes, if found that https://github.com/irods/python-irodsclient/commit/f2332b3ec9bba48913d9a2500391e4f1399248e9 might be related to this, since I think the issue appears in the cleanup phase.
Let me know if you need more information (I get 4.2.12 for the server version)).
@qubixes What's the scheme used by your client env file ? With native authentication, I wasn't able to reproduce the issue.
Also (again to @qubixes ) - assuming I'm able to reproduce at some point - which older version of PRC corrects the problem?
Will try with SSL enabled. Sometimes that makes a difference.
@qubixes Since the error is incurred while trying to send a disconnect message to the server, on a socket the iRODS server has already given up on, I am pretty sure functionality won't be threatened by the issue underlying this error message.
Perhaps it would be an option to turn this into an INFO level of logged message, instead of an actual exception. Will discuss with the rest of the iRODS Team.
@d-w-moore I'll just paste my irods_environment.json for completeness:
{
"irods_host": "geo.data.uu.nl",
"irods_port": 1247,
"irods_home": "/nluu11p/home",
"irods_user_name": "<snip>",
"irods_default_resource": "irodsResc",
"irods_zone_name": "nluu11p",
"irods_authentication_scheme": "pam",
"irods_encryption_algorithm": "AES-256-CBC",
"irods_encryption_key_size": 32,
"irods_encryption_num_hash_rounds": 16,
"irods_encryption_salt_size": 8,
"irods_client_server_policy": "CS_NEG_REQUIRE",
"irods_client_server_negotiation": "request_server_negotiation"
}
Will try with SSL enabled. Sometimes that makes a difference.
@qubixes Since the error is incurred while trying to send a disconnect message to the server, on a socket the iRODS server has already given up on, I am pretty sure functionality won't be threatened by the issue underlying this error message.
Perhaps it would be an option to turn this into an INFO level of logged message, instead of an actual exception. Will discuss with the rest of the iRODS Team.
Yes, functionally there are no problems, it's just the messaging to the user.
Also (again to @qubixes ) - assuming I'm able to reproduce at some point - which older version of PRC corrects the problem?
I did a little bit of a manual bisect, and I get the problem with PRC==2.2.0, but not with PRC==2.1.0. Let me know if you need me to bisect on commits.
not too many to eyeball there...
https://github.com/irods/python-irodsclient/compare/v2.1.0...v2.2.0
not too many to eyeball there...
No, but I see 2 or 3 that are potentially influencing changes.
Since unhandled exceptions aren't good in finalizers, this loose end should be tied up neatly and logged. But the logging shouldn't be seen at the default level of WARNING, so let's do an INFO level message that simply displays the object representation of the Exception.
It's still elusive to try and reproduce this error, so although I have what I think is a fix, I can't really test it.
For now, keep your solution to this in its own branch until we can reproduce it. That keeps other PRs from being blocked by this.
It's still elusive to try and reproduce this error, so although I have what I think is a fix, I can't really test it.
I could test it if you want!
It's still elusive to try and reproduce this error, so although I have what I think is a fix, I can't really test it.
I could test it if you want!
Yes, it's certainly a help to find conditions that reproduce it; could help us to get a fix in sooner.
@d-w-moore - Please share the branch and/or diff with the experimental fix and perhaps @qubixes can apply it and try it out?
@d-w-moore - Please share the branch and/or diff with the experimental fix and perhaps @qubixes can apply it and try it out?
Good idea, will do.
@qubixes Please see dev branch https://github.com/d-w-moore/python-irodsclient/tree/691.m , the very latest commit 3239d54bc9191c36ad6d3ba189ef6ba2f8b85d73, as a basis for how to test for whether my existing patch helps in this issue
@qubixes Please see dev branch https://github.com/d-w-moore/python-irodsclient/tree/691.m , the very latest commit 3239d54, as a basis for how to test for whether my existing patch helps in this issue
Yes, it does help the issue, I don't get a message about the connection anymore with the default logging level.
I can still reproduce the error message for the latest version on PyPI (3.0.0).
Good.
@d-w-moore Does @qubixes' recent success with the change help us understand the problem?
Good.
@d-w-moore Does @qubixes' recent success with the change help us understand the problem?
Not on any grand level. But it does suppress a message that was potentially not useful for the application user (ie noise control) -- provided the default WARNING log level isn't decreased for the log message in question. It also means we've put a bandage on what was a potential troublemaker, since exceptions should never allowed to be unhandled within the scope of the __del__ method of an object.
@qubixes - I tried again with iRODS 4.2.12, PRC v2.2.0, with PAM & SSL login, using your script above almost exactly, and am still not able to reproduce, which means I can't really write a test.
Can you think of any other special conditions (or modifications relative to a "fresh" install) that apply to your system? Or perhaps a set of steps you go through that is guaranteed to produce the error?
@d-w-moore I have asked @stsnel, since he knows a lot more about the system than I do. For now we have the server logs from the server side from when I run the script:
Mar 25 09:17:19 pid:6687 NOTICE: writeLine: inString = {<email>#nluu12p} Agent process started from <ip1>
Mar 25 09:17:19 pid:6637 remote addresses: <ip2>, <ip1> ERROR: [-] /irods_source/server/core/src/rsApiHandler.cpp:542:int readAndProcClientM
sg(rsComm_t *, int) : status [SYS_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /irods_source/lib/core/src/sockComm.cpp:198:irods::error readMsgHeader(irods::network_object_ptr, msgHeader_t *, struct timeval *) : status [SYS
_HEADER_READ_LEN_ERR] errno [] -- message [failed to call 'read header']
[-] /irods_source/plugins/network/ssl/libssl.cpp:574:irods::error ssl_read_msg_header(irods::plugin_context &, void *, struct timeval *) : s
tatus [SYS_HEADER_READ_LEN_ERR] errno [] -- message [read 0 expected 4]
Mar 25 09:17:19 pid:6637 remote addresses: <ip2>, <ip1> ERROR: [-] /irods_source/server/core/src/rodsAgent.cpp:577:int runIrodsAgentFactory(
sockaddr_un) : status [SSL_SHUTDOWN_ERROR] errno [] -- message [failed to call 'agent stop']
[-] /irods_source/lib/core/src/sockComm.cpp:160:irods::error sockAgentStop(irods::network_object_ptr) : status [SSL_SHUTDOWN_ERROR] errno [] -- message [failed to call 'agent stop']
[-] /irods_source/plugins/network/ssl/libssl.cpp:954:irods::error ssl_agent_stop(irods::plugin_context &) : status [SSL_SHUTDOWN_ERROR] errno [] -- message [error completing shutdown of SSL connection]
Preliminary results for UU Yoda 1.9.x environment (based on iRODS 4.2.12) with Python 3.12.3 client:
- Python-irodsclient 2.0.0: issue does not occur ✅
- Python-irodsclient 2.1.0: issue does not occur ✅
- Python-irodsclient 2.2.0: issue occurs reliably 🔴
- Python-irodsclient 3.0.0: issue does not occur ✅
- Python-irodsclient 3.1.0: issue occurs reliably 🔴
If I run a bisect between 3.0.0 and 3.1.0, I get:
bee82948ba15227b7cca990b133f8d2b829533cd is the first bad commit
Link to this commit: https://github.com/irods/python-irodsclient/commit/bee82948ba15227b7cca990b133f8d2b829533cd
I'll check with @qubixes to see if we can compare notes and gather more information.
Checking into this again. It could be sensitive to the slightest things, and I'm beginning to think the Python version makes the difference (It wouldn't be the first time, esp where interpreter exit / GC issues are concerned). I had not been testing with anything as recent as Python 3.11 or 3.12....
@stsnel , I was wondering what you use in your bisect to determine success/failure. Do you have a ready-made script or were you evaluating the condition manually?
@stsnel , I was wondering what you use in your bisect to determine success/failure. Do you have a ready-made script or were you evaluating the condition manually?
I was using the script provided by Qubixes at https://github.com/irods/python-irodsclient/issues/691#issue-2918364743 . I invoked it three times per PRC version to determine if the failure occurred (sort of) consistently.
Checking into this again. It could be sensitive to the slightest things, and I'm beginning to think the Python version makes the difference (It wouldn't be the first time, esp where interpreter exit / GC issues are concerned). I had not been testing with anything as recent as Python 3.11 or 3.12....
Right, that makes sense, I also see that I get different results with Python 3.9.20 on my system:
- Python-irodsclient 2.0.0: issue does not occur ✅
- Python-irodsclient 2.1.0: issue does not occur ✅
- Python-irodsclient 2.2.0: issue occurs reliably 🔴
- Python-irodsclient 3.0.0: issue occurs reliably 🔴
- Python-irodsclient 3.1.0: issue occurs reliably 🔴
Unfortunately I'm not getting consistent results with yours , @stsnel and @qubixes . Still unable to reproduce the error dump:
(py312) testuser@ac99544526dd:/tmp$ ./pamss691.py ~/.irods//irods_environment.json
session.server_version = (4, 2, 12)
session.pool.account.authentication_scheme = 'pam'
sys.version = '3.12.3 (main, Mar 27 2025, 12:30:19) [GCC 7.5.0]'
type(list(session.pool.idle|session.pool.active)[0].socket) = <class 'ssl.SSLSocket'>
(py312) testuser@ac99544526dd:/tmp$ ./pamss691.py ~/.irods//irods_environment.json
session.server_version = (4, 2, 12)
session.pool.account.authentication_scheme = 'pam'
sys.version = '3.12.3 (main, Mar 27 2025, 12:30:19) [GCC 7.5.0]'
type(list(session.pool.idle|session.pool.active)[0].socket) = <class 'ssl.SSLSocket'>
(py312) testuser@ac99544526dd:/tmp$ ./pamss691.py ~/.irods//irods_environment.json
session.server_version = (4, 2, 12)
session.pool.account.authentication_scheme = 'pam'
sys.version = '3.12.3 (main, Mar 27 2025, 12:30:19) [GCC 7.5.0]'
type(list(session.pool.idle|session.pool.active)[0].socket) = <class 'ssl.SSLSocket'>
(py312) testuser@ac99544526dd:/tmp$ ./pamss691.py ~/.irods//irods_environment.json
session.server_version = (4, 2, 12)
session.pool.account.authentication_scheme = 'pam'
sys.version = '3.12.3 (main, Mar 27 2025, 12:30:19) [GCC 7.5.0]'
type(list(session.pool.idle|session.pool.active)[0].socket) = <class 'ssl.SSLSocket'>
(py312) testuser@ac99544526dd:/tmp$
Using this script and alice as the PAM login with password 'apass'
(py312) testuser@ac99544526dd:/tmp$ more pamss691.py
#!/usr/bin/env python3
import sys
from irods.session import iRODSSession
if __name__ == "__main__":
env_file = sys.argv[1]
try:
session = iRODSSession(password="apass1", irods_env_file=env_file)
print(session.server_version)
except Exception:
pass
session = iRODSSession(password="apass", irods_env_file=env_file)
print(f'{session.server_version = }')
print(f'{session.pool.account.authentication_scheme = }')
print(f'{sys.version = }')
print(f'{type(list(session.pool.idle|session.pool.active)[0].socket) = }')
and in the same terminal session
(py312) testuser@ac99544526dd:/tmp$ python -c 'import irods;print(irods.__version__)'
2.2.0
(py312) testuser@ac99544526dd:/tmp$
@stsnel - Any changes to the irods.client_configuration settings and/or environment variables that are in force (related to those settings or perhaps to Yoda) may be influencing the results. Also, which OS are you running on?
Hi @d-w-moore,
I've performed these tests on a laptop running Ubuntu 24.04 LTS.
I don't see any relevant environment variables that could influence this behaviour. Please find the settings in the irods_environment file on the client below:
$ grep -vE "(irods_home|irods_user_name|irods_zone_name|irods_host)" ~/.irods/irods_environment.json
{
"irods_authentication_scheme": "pam",
"irods_client_server_negotiation": "request_server_negotiation",
"irods_client_server_policy": "CS_NEG_REQUIRE",
"irods_port": 1247,
"irods_default_resource": "irodsResc",
"irods_encryption_algorithm": "AES-256-CBC",
"irods_encryption_key_size": 32,
"irods_encryption_num_hash_rounds": 16,
"irods_encryption_salt_size": 8,
}
Settings on the provider that we're connecting to:
$ grep -E "(irods_client_server|irods_encryption)" .irods/irods_environment.json
"irods_client_server_negotiation": "request_server_negotiation",
"irods_client_server_policy": "CS_NEG_REQUIRE",
"irods_encryption_algorithm": "AES-256-CBC",
"irods_encryption_key_size": 32,
"irods_encryption_num_hash_rounds": 16,
"irods_encryption_salt_size": 8,
I can reproduce the issue with your script in combination with Python 3.12.3 and PRC 3.1.0:
$ ./test-daniel.py ~/.irods/irods_environment.json
session.server_version = (4, 2, 12)
session.pool.account.authentication_scheme = 'pam'
sys.version = '3.12.3 (main, Jan 17 2025, 18:03:48) [GCC 13.3.0]'
type(list(session.pool.idle|session.pool.active)[0].socket) = <class 'ssl.SSLSocket'>
Exception ignored in: <function Connection.__del__ at 0x7e6ae95fc900>
Traceback (most recent call last):
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 147, in __del__
self.disconnect()
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 376, in disconnect
self.send(disconnect_msg)
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 163, in send
raise NetworkException("Unable to send message")
irods.exception.NetworkException: Unable to send message
If I print a traceback for the exception that is raised when creating the first session, it seems to complain about a missing import:
Traceback (most recent call last):
File "/some-directory/venv/lib/python3.12/site-packages/irods/session.py", line 416, in __server_version
conn = next(iter(self.pool.active))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
StopIteration
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/some-directory/venv/lib/python3.12/site-packages/irods/pool.py", line 93, in get_connection
conn = self.idle.pop()
^^^^^^^^^^^^^^^
KeyError: 'pop from an empty set'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 600, in _login_pam
output_message = self.recv()
^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 196, in recv
raise exc
irods.exception.PAM_AUTH_PASSWORD_FAILED: None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/some-directory/./test-daniel.py", line 11, in <module>
print(session.server_version)
^^^^^^^^^^^^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/session.py", line 379, in server_version
return self._server_version()
^^^^^^^^^^^^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/session.py", line 412, in _server_version
return self.__server_version() if version_func is None else version_func(self)
^^^^^^^^^^^^^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/session.py", line 419, in __server_version
conn = self.pool.get_connection()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/pool.py", line 18, in method_
ret = method(self, *s, **kw)
^^^^^^^^^^^^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/pool.py", line 117, in get_connection
conn = Connection(self, self.account)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 123, in __init__
self._login_pam()
File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 601, in _login_pam
except irods.exception.PAM_AUTH_PASSWORD_INVALID_TTL as exc:
^^^^^
NameError: name 'irods' is not defined
So it looks like connection.py needs an additional import here. However, adding the import does not resolve the issue, so this does not look related. Have submitted a separate PR for this: https://github.com/irods/python-irodsclient/pull/705
I can reproduce the issue with your script in combination with Python 3.12.3 and PRC 3.1.0:
$ ./test-daniel.py ~/.irods/irods_environment.json session.server_version = (4, 2, 12) session.pool.account.authentication_scheme = 'pam' sys.version = '3.12.3 (main, Jan 17 2025, 18:03:48) [GCC 13.3.0]' type(list(session.pool.idle|session.pool.active)[0].socket) = <class 'ssl.SSLSocket'> Exception ignored in: <function Connection.__del__ at 0x7e6ae95fc900> Traceback (most recent call last): File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 147, in __del__ self.disconnect() File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 376, in disconnect self.send(disconnect_msg) File "/some-directory/venv/lib/python3.12/site-packages/irods/connection.py", line 163, in send raise NetworkException("Unable to send message") irods.exception.NetworkException: Unable to send message
hi @stsnel , Indeed - I was able to see the error message myself with these exact versions, except using Ubuntu 18.04. Presumably the OS doesn't matter as much. Thanks again!
Since the issue at hand seems solved in https://github.com/d-w-moore/python-irodsclient/tree/691.m, and @stsnel has merged a fix for #705 we'll consider for the time being no more work needs to be done here.