neo4j-python-driver
neo4j-python-driver copied to clipboard
verify_connectivity() doesn't always catch errors
In the LangChain integration, we use the following block of code:
try:
self._driver.verify_connectivity()
except neo4j.exceptions.ServiceUnavailable:
raise ValueError(
"Could not connect to Neo4j database. "
"Please ensure that the url is correct"
)
except neo4j.exceptions.AuthError:
raise ValueError(
"Could not connect to Neo4j database. "
"Please ensure that the username and password are correct"
)
# Set schema
try:
self.refresh_schema() # Does read transactions
except neo4j.exceptions.ClientError:
raise ValueError(
"Could not use APOC procedures. "
"Please ensure the APOC plugin is installed in Neo4j and that "
"'apoc.meta.data()' is allowed in Neo4j configuration "
)
Every now and then it happens that the verify_connectivity
doesn't catch any error, and the error only appears when we actually do read transactions in the refresh_schema
method. This is a bit confusing for the users, but I can't give you a reproducible example when does this happen
Here is one example of an error that isn't being caught:
ClientError: {code: Neo.ClientError.Database.DatabaseNotFound} {message: Unable to get a routing table for database 'neo4j' because this database does not exist}
Hi and thank you for reaching out!
As a primer: there's a discussion to be had about which exact connectivity this function should verify. Here are some options:
- Just the TCP connection to the server/cluster
- To the DBMS itself
- To the user space (login and such)
- To a database hosted on the DBMS. In this case: which one? The user's home databse? The always existing
system
database? Another one?
Currently, it does the latter option: it tries to connect to the user's home database.
Therefore, I'm actually surprised that Neo.ClientError.Database.DatabaseNotFound
does not get raised. I couldn't reproduce it either. Using
- neo4j v5.15 (dockerized)
- Python v3.11
- neo4j Python driver from
5.x
(think dev) branch, pretty much identical to v5.17.
I spin up a fresh DBMS, run DROP DATABASE neo4j
, and then
import asyncio
import neo4j
from neo4j.debug import watch
watch("neo4j") # driver debug logging
URL = "neo4j://localhost:7687"
AUTH = ("neo4j", "pass")
async def main() -> None:
async with neo4j.AsyncGraphDatabase.driver(URL, auth=AUTH) as driver:
await driver.verify_connectivity()
if __name__ == '__main__':
asyncio.run(main())
The fact that this is async doesn't change a thing regarding the handling of server errors.
And here's the output of that:
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,715 [#0000] _: <POOL> created, routing address IPv4Address(('localhost', 7687))
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,715 [#0000] _: <WORKSPACE> resolve home database
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,716 [#0000] _: <POOL> attempting to update routing table from IPv4Address(('localhost', 7687))
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,716 [#0000] _: <RESOLVE> in: localhost:7687
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,720 [#0000] _: <RESOLVE> dns resolver out: 127.0.0.1:7687
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,720 [#0000] _: <POOL> _acquire router connection, database=None, address=ResolvedIPv4Address(('127.0.0.1', 7687))
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,720 [#0000] _: <POOL> trying to hand out new connection
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,720 [#0000] C: <OPEN> 127.0.0.1:7687
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,726 [#D8E0] C: <MAGIC> 0x6060B017
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,726 [#D8E0] C: <HANDSHAKE> 0x00040405 0x00020404 0x00000104 0x00000003
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,728 [#D8E0] S: <HANDSHAKE> 0x00000405
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,728 [#D8E0] C: HELLO {'user_agent': 'neo4j-python/5.17.dev0 Python/3.11.6-final-0 (linux)', 'routing': {'address': 'localhost:7687'}, 'bolt_agent': {'product': 'neo4j-python/5.17.dev0', 'platform': 'Linux 5.15.0-94-generic; x86_64', 'language': 'Python/3.11.6-final-0', 'language_details': 'CPython; 3.11.6-final-0 (main, Oct 4 2023 09:29:57) [GCC 11.4.0]'}}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,728 [#D8E0] _: <CONNECTION> client state: CONNECTED > AUTHENTICATION
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,728 [#D8E0] C: LOGON {'scheme': 'basic', 'principal': 'neo4j', 'credentials': '*******'}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,728 [#D8E0] _: <CONNECTION> client state: AUTHENTICATION > READY
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,731 [#D8E0] S: SUCCESS {'server': 'Neo4j/5.15.0', 'connection_id': 'bolt-34', 'hints': {'connection.recv_timeout_seconds': 120, 'telemetry.enabled': True}}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,731 [#D8E0] _: <CONNECTION> server state: CONNECTED > AUTHENTICATION
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,733 [#D8E0] S: SUCCESS {}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,733 [#D8E0] _: <CONNECTION> server state: AUTHENTICATION > READY
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,733 [#D8E0] C: ROUTE {'address': 'localhost:7687'} () {}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,735 [#D8E0] S: FAILURE {'code': 'Neo.ClientError.Database.DatabaseNotFound', 'message': "Unable to get a routing table for database 'neo4j' because this database does not exist"}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,735 [#D8E0] C: RESET
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,737 [#D8E0] S: SUCCESS {}
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,737 [#D8E0] _: <CONNECTION> server state: FAILED > READY
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,737 [#D8E0] _: <POOL> released bolt-34
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,738 [#0000] _: <POOL> close
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,738 [#D8E0] C: GOODBYE
[DEBUG ] [Thread 140403322895232] [Task 140403300631952] 2024-02-08 09:11:09,739 [#D8E0] C: <CLOSE>
Traceback (most recent call last):
File "/home/rouven/.config/JetBrains/PyCharm2023.2/scratches/scratch_38.py", line 33, in <module>
asyncio.run(main())
File "/home/rouven/.pyenv/versions/3.11.6/lib/python3.11/asyncio/runners.py", line 190, in run
return runner.run(main)
^^^^^^^^^^^^^^^^
File "/home/rouven/.pyenv/versions/3.11.6/lib/python3.11/asyncio/runners.py", line 118, in run
return self._loop.run_until_complete(task)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/.pyenv/versions/3.11.6/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
return future.result()
^^^^^^^^^^^^^^^
File "/home/rouven/.config/JetBrains/PyCharm2023.2/scratches/scratch_38.py", line 15, in main
await driver.verify_connectivity()
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/driver.py", line 1008, in verify_connectivity
await self._get_server_info(session_config)
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/driver.py", line 1219, in _get_server_info
return await session._get_server_info()
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/work/session.py", line 172, in _get_server_info
await self._connect(READ_ACCESS, liveness_check_timeout=0)
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/work/session.py", line 130, in _connect
await super()._connect(
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/work/workspace.py", line 161, in _connect
await self._pool.update_routing_table(
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_pool.py", line 779, in update_routing_table
if await self._update_routing_table_from(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_pool.py", line 725, in _update_routing_table_from
new_routing_table = await self.fetch_routing_table(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_pool.py", line 662, in fetch_routing_table
new_routing_info = await self.fetch_routing_info(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_pool.py", line 632, in fetch_routing_info
routing_table = await cx.route(
^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_bolt5.py", line 196, in route
await self.fetch_all()
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_bolt.py", line 863, in fetch_all
detail_delta, summary_delta = await self.fetch_message()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_bolt.py", line 849, in fetch_message
res = await self._process_message(tag, fields)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_bolt5.py", line 369, in _process_message
await response.on_failure(summary_metadata or {})
File "/home/rouven/Neo4j/code/neo4j-python-driver/src/neo4j/_async/io/_common.py", line 245, in on_failure
raise Neo4jError.hydrate(**metadata)
neo4j.exceptions.ClientError: {code: Neo.ClientError.Database.DatabaseNotFound} {message: Unable to get a routing table for database 'neo4j' because this database does not exist}
Process finished with exit code 1
neo4j.exceptions.ClientError: {code: Neo.ClientError.Database.DatabaseNotFound} {message: Unable to get a routing table for database 'neo4j' because this database does not exist}
Could you please help me reproduce the issue where verify_connectivity()
does not catch trying to connect to a database that doesn't exist? Or any other error you think it should catch.
Here is the full traceback, not sure if it is helpful as that's the error from refresh_schema and not the connectivity method.
---------------------------------------------------------------------------
ClientError Traceback (most recent call last)
File ~/.local/lib/python3.10/site-packages/langchain/graphs/neo4j_graph.py:65, in Neo4jGraph.__init__(self, url, username, password, database)
64 try:
---> 65 self.refresh_schema()
66 except neo4j.exceptions.ClientError:
File ~/.local/lib/python3.10/site-packages/langchain/graphs/neo4j_graph.py:88, in Neo4jGraph.refresh_schema(self)
85 """
86 Refreshes the Neo4j graph schema information.
87 """
---> 88 node_properties = [el["output"] for el in self.query(node_properties_query)]
89 rel_properties = [el["output"] for el in self.query(rel_properties_query)]
File ~/.local/lib/python3.10/site-packages/langchain/graphs/neo4j_graph.py:79, in Neo4jGraph.query(self, query, params)
78 try:
---> 79 data = session.run(query, params)
80 return [r.data() for r in data]
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/work/session.py:302, in Session.run(self, query, parameters, **kwargs)
301 if not self._connection:
--> 302 self._connect(self._config.default_access_mode)
303 assert self._connection is not None
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/work/session.py:130, in Session._connect(self, access_mode, **acquire_kwargs)
129 try:
--> 130 super()._connect(
131 access_mode, auth=self._config.auth, **acquire_kwargs
132 )
133 except asyncio.CancelledError:
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/work/workspace.py:178, in Workspace._connect(self, access_mode, auth, **acquire_kwargs)
177 acquire_kwargs_.update(acquire_kwargs)
--> 178 self._connection = self._pool.acquire(**acquire_kwargs_)
179 self._connection_access_mode = access_mode
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_pool.py:912, in Neo4jPool.acquire(self, access_mode, timeout, database, bookmarks, auth, liveness_check_timeout)
910 log.debug("[#0000] _: <POOL> acquire routing connection, "
911 "access_mode=%r, database=%r", access_mode, database)
--> 912 self.ensure_routing_table_is_fresh(
913 access_mode=access_mode, database=database,
914 imp_user=None, bookmarks=bookmarks, auth=auth,
915 acquisition_timeout=timeout
916 )
918 while True:
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_pool.py:854, in Neo4jPool.ensure_routing_table_is_fresh(self, access_mode, database, imp_user, bookmarks, auth, acquisition_timeout, database_callback)
852 return False
--> 854 self.update_routing_table(
855 database=database, imp_user=imp_user, bookmarks=bookmarks,
856 auth=auth, acquisition_timeout=acquisition_timeout,
857 database_callback=database_callback
858 )
859 self.update_connection_pool(database=database)
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_pool.py:776, in Neo4jPool.update_routing_table(self, database, imp_user, bookmarks, auth, acquisition_timeout, database_callback)
774 if prefer_initial_routing_address:
775 # TODO: Test this state
--> 776 if self._update_routing_table_from(
777 self.address, database=database,
778 imp_user=imp_user, bookmarks=bookmarks, auth=auth,
779 acquisition_timeout=acquisition_timeout,
780 database_callback=database_callback
781 ):
782 # Why is only the first initial routing address used?
783 return
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_pool.py:722, in Neo4jPool._update_routing_table_from(self, database, imp_user, bookmarks, auth, acquisition_timeout, database_callback, *routers)
719 for address in NetworkUtil.resolve_address(
720 router, resolver=self.pool_config.resolver
721 ):
--> 722 new_routing_table = self.fetch_routing_table(
723 address=address, acquisition_timeout=acquisition_timeout,
724 database=database, imp_user=imp_user, bookmarks=bookmarks,
725 auth=auth
726 )
727 if new_routing_table is not None:
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_pool.py:659, in Neo4jPool.fetch_routing_table(self, address, acquisition_timeout, database, imp_user, bookmarks, auth)
658 try:
--> 659 new_routing_info = self.fetch_routing_info(
660 address, database, imp_user, bookmarks, auth,
661 acquisition_timeout
662 )
663 except Neo4jError as e:
664 # checks if the code is an error that is caused by the client. In
665 # this case there is no sense in trying to fetch a RT from another
666 # router. Hence, the driver should fail fast during discovery.
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_pool.py:629, in Neo4jPool.fetch_routing_info(self, address, database, imp_user, bookmarks, auth, acquisition_timeout)
628 try:
--> 629 routing_table = cx.route(
630 database=database or self.workspace_config.database,
631 imp_user=imp_user or self.workspace_config.impersonated_user,
632 bookmarks=bookmarks
633 )
634 finally:
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_bolt4.py:534, in Bolt4x4.route(self, database, imp_user, bookmarks, dehydration_hooks, hydration_hooks)
533 self.send_all()
--> 534 self.fetch_all()
535 return [metadata.get("rt")]
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_bolt.py:863, in Bolt.fetch_all(self)
862 while not response.complete:
--> 863 detail_delta, summary_delta = self.fetch_message()
864 detail_count += detail_delta
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_bolt.py:849, in Bolt.fetch_message(self)
846 tag, fields = self.inbox.pop(
847 hydration_hooks=self.responses[0].hydration_hooks
848 )
--> 849 res = self._process_message(tag, fields)
850 self.idle_since = monotonic()
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_bolt4.py:368, in Bolt4x0._process_message(self, tag, fields)
367 try:
--> 368 response.on_failure(summary_metadata or {})
369 except (ServiceUnavailable, DatabaseUnavailable):
File ~/anaconda3/envs/python3/lib/python3.10/site-packages/neo4j/_sync/io/_common.py:245, in Response.on_failure(self, metadata)
244 Util.callback(handler)
--> 245 raise Neo4jError.hydrate(**metadata)
ClientError: {code: Neo.ClientError.Database.DatabaseNotFound} {message: Unable to get a routing table for database 'neo4j' because this database does not exist}
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[71], line 3
1 from langchain.graphs import Neo4jGraph
----> 3 graph = Neo4jGraph(
4 url=NEO4J_URI,
5 username=NEO4J_USERNAME,
6 password=NEO4J_PASSWORD
7 )
File ~/.local/lib/python3.10/site-packages/langchain/graphs/neo4j_graph.py:67, in Neo4jGraph.__init__(self, url, username, password, database)
65 self.refresh_schema()
66 except neo4j.exceptions.ClientError:
---> 67 raise ValueError(
68 "Could not use APOC procedures. "
69 "Please ensure the APOC plugin is installed in Neo4j and that "
70 "'apoc.meta.data()' is allowed in Neo4j configuration "
71 )
ValueError: Could not use APOC procedures. Please ensure the APOC plugin is installed in Neo4j and that 'apoc.meta.data()' is allowed in Neo4j configuration
This is quite interesting. It also seems you can reproduce the error. Could you please enable debug logging in the driver and post the full logs (including the passing verify_connectivity()
call)? See the docs on how to enable it.
@tomasonjo gentle reminder, can you please provide the driver debug logs to enable me to further investigate the issue? Or better yet, help we reproduce the issue. What exactly did you do to trigger it?
Hey @robsdedude . I couldn't reproduce it locally, I just got a full debug from the user who reported the issue. Could it be something with firewall permissions, since they were at a customer with strict security?
The only think I could imagine is something along the lines of the user running multiple DBMSs and having a single DNS entry point at both (or a disjoint/misconfigured cluster or something like it) so that it's up to chance whether the driver succeeds or fails. I definitely need more information to be able to make a call if this is really a driver bug or if this is expected behavior and the problem lies somewhere else.
As written, I'm afraid I won't be able to dig further into the issue without extra info. Therefore, I'll close the issue. We can always reopen later if new information is available.