test_request_too_large[SSL] fails with NoHostAvailable
It started happening since move to python 3.14 docker image, and it affects only the SSL variant of this test.
Refs: https://github.com/scylladb/scylladb/issues/27300
From looking on detailed logs I see that there was empty system.peers results:
09:38:23,694 cassandra.cluster DEBUG dtest_setup.py :497 | [control connection] Removing host not found in peers metadata: <Host: 127.0.17.1:9042 datacenter1>
09:38:23,694 cassandra.cluster DEBUG dtest_setup.py :497 | [control connection] Finished fetching ring info
09:38:23,694 cassandra.cluster DEBUG dtest_setup.py :497 | [control connection] Rebuilding token map due to topology changes
09:38:23,698 cassandra.cluster DEBUG dtest_setup.py :497 | Control connection created
09:38:23,699 cassandra.pool DEBUG thread.py :73 | Initializing connection for host 127.0.17.1:9042
09:38:23,704 cassandra.connection DEBUG libevreactor.py:272 | Connection 140136824022800: '('127.0.0.1', 40786)' -> '('127.0.17.1', 9042)'
09:38:23,704 cassandra.connection DEBUG libevreactor.py:279 | Sending initial options message for new connection (140136824022800) to 127.0.17.1:9042
09:38:23,704 cassandra.protocol_features DEBUG protocol_features.py:62 | Parsing sharding info from message options {'COMPRESSION': ['lz4', 'snappy'], 'SCYLLA_LWT_ADD_METADATA_MARK': ['LWT_OPTIMIZATION_META_BIT_MASK=2147483648'], 'SCYLLA_NR_SHARDS': ['2'], 'SCYLLA_PARTITIONER': ['org.apache.cassandra.dht.Murmur3Partitioner'], 'SCYLLA_RATE_LIMIT_ERROR': ['ERROR_CODE=61440'], 'SCYLLA_SHARD': ['0'], 'SCYLLA_SHARDING_ALGORITHM': ['biased-token-round-robin'], 'SCYLLA_SHARDING_IGNORE_MSB': ['12'], 'SCYLLA_SHARD_AWARE_PORT': ['19042'], 'SCYLLA_USE_METADATA_ID': [''], 'TABLETS_ROUTING_V1': ['']}
09:38:23,704 cassandra.connection DEBUG libevreactor.py:376 | Received options response on new connection (140136824022800) from 127.0.17.1:9042
09:38:23,704 cassandra.connection ERROR libevreactor.py:376 | No available compression types supported on both ends. locally supported: odict_keys([]). remotely supported: ['lz4', 'snappy']
09:38:23,704 cassandra.connection DEBUG libevreactor.py:376 | Sending StartupMessage on <LibevConnection(140136824022800) 127.0.17.1:9042>
09:38:23,704 cassandra.connection DEBUG libevreactor.py:376 | Sent StartupMessage on <LibevConnection(140136824022800) 127.0.17.1:9042>
09:38:23,705 cassandra.connection DEBUG libevreactor.py:376 | Got ReadyMessage on new connection (140136824022800) from 127.0.17.1:9042
09:38:23,705 cassandra.pool DEBUG thread.py :73 | First connection created to 127.0.17.1:9042 for shard_id=0
09:38:23,705 cassandra.pool DEBUG thread.py :73 | Finished initializing connection for host 127.0.17.1:9042
09:38:23,705 cassandra.cluster DEBUG thread.py :73 | Added pool for host 127.0.17.1:9042 to session
09:38:23,705 cassandra.pool DEBUG thread.py :73 | shard_aware_endpoint=<DefaultEndPoint: 127.0.17.1:19042>
09:38:23,705 cassandra.cluster DEBUG dtest_setup.py :497 | Not starting MonitorReporter thread for Insights; not supported by server version 3.0.8 on ControlConnection host 127.0.17.1:9042
09:38:23,705 cassandra.cluster DEBUG dtest_setup.py :497 | Started Session with client_id df5bc982-cca4-48e9-8939-e24ddc034b46 and session_id c7e27a42-894c-4b1c-8529-fcd4ad07861a
09:38:23,706 cassandra.connection DEBUG libevreactor.py:272 | connection (140136824023120) port=62827 should be shard_id=1
09:38:23,706 limits_test INFO limits_test.py :199 | Trying to send a regular request to database (insert a string of regular size into table)...
09:38:23,712 cassandra.connection DEBUG libevreactor.py:272 | Connection 140136824023120: '('127.0.0.1', 62827)' -> '('127.0.17.1', 19042)'
09:38:23,713 cassandra.connection DEBUG libevreactor.py:279 | Sending initial options message for new connection (140136824023120) to 127.0.17.1:19042
09:38:23,716 cassandra.connection DEBUG libevreactor.py:333 | Defuncting connection (140136824022800) to 127.0.17.1:9042:
Traceback (most recent call last):
File "/usr/local/lib/python3.14/site-packages/cassandra/io/libevreactor.py", line 324, in handle_write
sent = self._socket.send(next_msg)
File "/usr/local/lib/python3.14/ssl.py", line 1232, in send
return self._sslobj.write(data)
~~~~~~~~~~~~~~~~~~^^^^^^
BrokenPipeError: [Errno 32] Broken pipe
09:38:23,716 cassandra.io.libevreactor DEBUG libevreactor.py:292 | Closing connection (140136824022800) to 127.0.17.1:9042
09:38:23,717 cassandra.io.libevreactor DEBUG libevreactor.py:296 | Closed socket to 127.0.17.1:9042
09:38:23,717 cassandra.pool DEBUG libevreactor.py:333 | Defunct or closed connection (140136824022800) returned to pool, potentially marking host 127.0.17.1:9042 as down
09:38:23,717 cassandra.pool DEBUG libevreactor.py:333 | Shutting down connections to 127.0.17.1:9042
09:38:23,717 cassandra.pool DEBUG libevreactor.py:333 | Closing connection (140136824022800) to 127.0.17.1:9042
09:38:23,717 cassandra.pool DEBUG libevreactor.py:333 | Closing pending connection (140136824023120) to 127.0.17.1:9042
09:38:23,717 cassandra.io.libevreactor DEBUG libevreactor.py:292 | Closing connection (140136824023120) to 127.0.17.1:19042
09:38:23,717 cassandra.io.libevreactor DEBUG libevreactor.py:296 | Closed socket to 127.0.17.1:19042
09:38:23,717 cassandra.cluster WARNING libevreactor.py:333 | Host 127.0.17.1:9042 has been marked down
09:38:23,717 cassandra.pool DEBUG thread.py :73 | Received a connection 140136824023120 for shard_id=1 on host 127.0.17.1:9042
09:38:23,717 cassandra.cluster DEBUG thread.py :73 | [control connection] Control connection host (127.0.17.1:9042) is considered down, starting reconnection
09:38:23,717 cassandra.pool DEBUG thread.py :73 | Pool for host 127.0.17.1:9042 is in shutdown, closing the new connection (140136824023120)
---------------------------- Captured log teardown -----------------------------
09:38:23,761 cassandra.cluster DEBUG conftest.py :322 | Shutting down Cluster Scheduler
09:38:23,785 cassandra.cluster DEBUG conftest.py :322 | Shutting down control connection
09:38:23,785 cassandra.io.libevreactor DEBUG libevreactor.py:292 | Closing connection (140136824018640) to 127.0.17.1:9042
09:38:23,785 cassandra.io.libevreactor DEBUG libevreactor.py:296 | Closed socket to 127.0.17.1:9042
09:38:23,786 cassandra.cluster DEBUG conftest.py :322 | Shutting down Cluster Scheduler
09:38:23,819 cassandra.cluster DEBUG conftest.py :322 | Shutting down control connection
09:38:23,819 cassandra.io.libevreactor DEBUG libevreactor.py:292 | Closing connection (140136824026320) to 127.0.17.1:9042
09:38:23,820 cassandra.io.libevreactor DEBUG libevreactor.py:296 | Closed socket to 127.0.17.1:9042
09:38:26,825 dtest_setup DEBUG dtest_setup.py :630 | exclude_errors: []
09:38:26,835 dtest_setup DEBUG dtest_setup.py :729 | removing ccm cluster test at: /home/sylwiaszunejko/.dtest/dtest-v0lgeo3u
09:38:26,844 dtest_setup DEBUG dtest_setup.py :732 | clearing ssl stores from [/home/sylwiaszunejko/.dtest/dtest-v0lgeo3u] directory
09:38:26,845 dtest_setup DEBUG dtest_setup.py :85 | Freeing cluster ID 17: link /home/sylwiaszunejko/.dtest/17
@sylwiaszunejko , please update master issue - https://github.com/scylladb/python-driver/issues/619, not here
it is also inline with what I observed in https://github.com/scylladb/python-driver/issues/616#issuecomment-3642916203
Actually I was to quick with jumping to the conclusion about this line, it is expected in this case due to the logic in python-driver that first when we resolving contact points it gets random host_id and then when we get proper id from the server the duplicate with wrong id needs to be removed
I ran into this a lot, like https://argus.scylladb.com/tests/scylla-cluster-tests/4db162e9-ee42-4fca-999e-74c6a6a31166 (last 5 runs) and https://argus.scylladb.com/tests/scylla-cluster-tests/021d804d-6f55-4cbe-87c0-cc5a64e04973 (last 2 runs)
Command: '/usr/bin/cqlsh --no-color --request-timeout=120 --connect-timeout=60 --ssl -e "describe keyspaces" 10.12.11.25'
Exit code: 1
Stdout:
Stderr:
Connection error: ('Unable to connect to any servers', {'10.12.11.25:9042': OperationTimedOut('errors=Timed out creating connection (60 seconds), last_host=None')})
But a quick test (https://argus.scylladb.com/tests/scylla-cluster-tests/e81a636c-1d78-4771-88fa-40c870f00dce) with configurations/disable_client_encrypt.yaml which make the commands not have the --ssl argument, the issue no longer happens.
< t:2025-12-18 15:49:52,312 f:remote_base.py l:650 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.12.3.94>: Running command "/usr/bin/cqlsh --no-color --request-timeout=120 --connect-timeout=60 -e "describe keyspaces" 10.12.3.94"...
< t:2025-12-18 15:49:54,815 f:base.py l:255 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.12.3.94>:
< t:2025-12-18 15:49:54,815 f:base.py l:255 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.12.3.94>: keyspace1 system_auth system_distributed_everywhere system_schema
< t:2025-12-18 15:49:54,815 f:base.py l:255 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.12.3.94>: system system_distributed system_replicated_keys system_traces
< t:2025-12-18 15:49:54,815 f:base.py l:255 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.12.3.94>:
< t:2025-12-18 15:49:54,816 f:base.py l:153 c:RemoteLibSSH2CmdRunner p:DEBUG > <10.12.3.94>: Command "/usr/bin/cqlsh --no-color --request-timeout=120 --connect-timeout=60 -e "describe keyspaces" 10.12.3.94" finished with status 0