neo4j-go-driver
ConnectivityError: Unable to retrieve routing table from <neo4j_host>:7688: EOF with neo4j-go-driver 4.4.x & 5.x.x
We hit this issue after upgrading Neo4j from 4.3 to 4.4 and using driver 4.4.7 with the neo4j protocol. We tried updating the driver to version 5.20.0, but that did not fix the issue.
This issue makes the system unstable: it works, but it intermittently returns the error Unable to retrieve routing table...
However, the old driver version 4.3.3 works fine (as does 4.3.8).
I assume there is a configuration mistake on our side, but I'm not sure. Any ideas would be appreciated.
System information:
- Neo4j version: Enterprise 4.4.31
- Neo4j Mode: HA cluster with 3 members
- Operating system: Amazon Linux 2
Hello, can you please share some Bolt traces in order for us to understand what's going on before and during that error?
Hello @fbiville ,
Sorry for the late reply. I have sent you the Bolt trace log. It covers the period from when the server received the request until it responded with the result. Lines with the prefix bolt_logger are written by the Bolt logger. bolt_logger.log
While the client is fetching the routing table, our code returns the error Unable to retrieve routing table, and the handshake only happens after that. This looks like a race condition, which would explain why it occurs at random.
This is a sample of the code we use to run Cypher queries.
session := driver.NewSession(neo4j.SessionConfig{AccessMode: neo4j.AccessModeWrite, BoltLogger: boltLogger})
defer session.Close()
result, err := session.Run(query, params)
if err != nil {
	return nil, err
}
I appreciate your support. I'm waiting for your response.
I have added a retry around result, err := session.Run(query, params) for when this issue occurs, and I think it works now. However, I am still confused: driver version 4.3.8 works perfectly for me, but with driver version 4.4.0 or above the issue happens. I read the changelog for version 4.4.0 but didn't find anything that would explain it.
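For anyone wanting to do the same, here is a minimal sketch of such a retry wrapper using only the standard library. The names retry, shouldRetry and errRouting are hypothetical and not part of the driver; errRouting just stands in for the connectivity error, and op is where the session.Run call would go. In real code the predicate could probably be the driver's own error check (v5 has neo4j.IsRetryable, as far as I know) rather than a sentinel comparison.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRouting is a stand-in for the driver's connectivity error (assumption:
// real code would inspect the error returned by the driver instead).
var errRouting = errors.New("ConnectivityError: Unable to retrieve routing table")

// retry calls op up to maxAttempts times. After each failed attempt it sleeps
// for delay and doubles it. It returns immediately if shouldRetry reports
// that the error is not worth retrying.
func retry(maxAttempts int, delay time.Duration, shouldRetry func(error) bool, op func() error) error {
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = op()
		if err == nil {
			return nil
		}
		if !shouldRetry(err) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	calls := 0
	err := retry(3, 10*time.Millisecond,
		func(err error) bool { return errors.Is(err, errRouting) },
		func() error {
			calls++
			if calls < 3 {
				return errRouting // simulate two transient failures
			}
			return nil // in real code: run the query here
		})
	fmt.Println("calls:", calls, "err:", err)
}
```

Note that this re-runs the whole operation, so it is only safe when running the query twice does no harm.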
You can also run session.ExecuteRead or session.ExecuteWrite to leverage the built-in retries (in 4.x, they're called session.ReadTransaction and session.WriteTransaction).
The only case when you cannot use these APIs is when you use CALL {} IN TRANSACTIONS or (in Neo4j 4.x only) LOAD CSV [...] USING PERIODIC COMMIT.
I have the same error. Subsequent calls to the driver after this error no longer work, and I am forced to restart the driver.
I am getting the same error with Go v.1.22.4, driver v.5.20.0 and database v.5.20.0 enterprise
"TransactionExecutionLimit: timeout (exceeded max retry time: 30s) after 6 attempts, last error: ConnectivityError: Unable to retrieve routing table from
After the driver returns an error, the code closes the connection and opens a new connection for further use. The code kind of recovers, but the loading of data gets delayed, which is very annoying.
@tkandal did you figure out a solution for fixing the issue yet?
I kind of found a solution. config.MaxTransactionRetryTime defaults to 30 seconds; I increased it to 60 seconds, and it seems to work much better.
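For reference, a sketch of what that setting might look like with the v5 driver, assuming the config package at github.com/neo4j/neo4j-go-driver/v5/neo4j/config. The URI and credentials are placeholders, and 60 seconds is just the value mentioned above, not a recommendation.

```go
package main

import (
	"context"
	"log"
	"time"

	"github.com/neo4j/neo4j-go-driver/v5/neo4j"
	"github.com/neo4j/neo4j-go-driver/v5/neo4j/config"
)

func main() {
	ctx := context.Background()

	// Placeholder URI and credentials: replace with your own.
	driver, err := neo4j.NewDriverWithContext(
		"neo4j://localhost:7687",
		neo4j.BasicAuth("neo4j", "password", ""),
		func(c *config.Config) {
			// Upper bound on how long managed transactions
			// (ExecuteRead / ExecuteWrite) keep retrying. Default is 30s.
			c.MaxTransactionRetryTime = 60 * time.Second
		},
	)
	if err != nil {
		log.Fatal(err)
	}
	defer driver.Close(ctx)
}
```

This only gives the retries more time to succeed; it does not explain why the routing table fetch fails in the first place.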
@fbiville the usage around session.ExecuteWrite is a bit confusing to me. It's not clear if it's safe to always use this, or if it should only be used in situations where the cmd is "idempotent" (see also this article).
Could there be a situation where the cmd was executed, but the connection timed out on the reply side, thus resulting in duplicate cmd execution if we use this func with auto retry? For now, I've only taken advantage of session.ExecuteRead out of caution. Are there some docs I'm overlooking around this?
@bradleygore ExecuteWrite should only be used with idempotent queries. If it retries due to a transient error of some kind then the second (or more) runs should not have a differing outcome to the first. If your query is not idempotent then there are two options.
- Make use of the session.Run interface which is autocommit with the transactions being handled server side.
- Manually handle the transactions yourself including error handling and potential retry mechanics.
With both of these options you have to handle any errors yourself, as they do not retry, and with a non-idempotent query you also have to take into account the state the DB is in.
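To make the idempotency concern concrete, here is a toy simulation, not driver code. The store type is a hypothetical in-memory stand-in for the database: create behaves like a plain Cypher CREATE (always adds a node), while merge behaves like MERGE on a unique key (adds only if absent). runWithLostReply models the scenario asked about above, under the assumption that the write commits but the reply is lost, so the client retries once.

```go
package main

import "fmt"

// store is a toy in-memory stand-in for a database (not a Neo4j API).
type store struct {
	nodes map[string]int // key -> number of nodes created with that key
}

// create mimics a plain CREATE: every call adds another node.
func (s *store) create(key string) { s.nodes[key]++ }

// merge mimics MERGE on a unique key: it creates the node only if absent.
func (s *store) merge(key string) {
	if s.nodes[key] == 0 {
		s.nodes[key] = 1
	}
}

// runWithLostReply models a retry after a lost acknowledgement.
func runWithLostReply(write func()) {
	write() // first attempt commits, but the reply never reaches the client
	write() // the client sees a failure and runs the same write again
}

func main() {
	s := &store{nodes: map[string]int{}}

	runWithLostReply(func() { s.create("order-1") }) // non-idempotent
	runWithLostReply(func() { s.merge("order-2") })  // idempotent

	fmt.Println("order-1 nodes:", s.nodes["order-1"]) // 2: duplicated by the retry
	fmt.Println("order-2 nodes:", s.nodes["order-2"]) // 1: the retry was harmless
}
```

The idempotent variant ends in the same state no matter how many times it runs, which is the property a retried write needs.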
Hey @AndyHeap-NeoTech - that's what I was thinking. How does that compare to the driver.ExecuteQuery func? From the docs it looks like that's recommended for both reads and writes.