incubator-hugegraph
incubator-hugegraph copied to clipboard
[Improvement] Mysql 8.x 会话超时,因不再支持 autoReconnect 参数导致超时自动中断
description
使用Mysql作为存储,当连接超过 wait_timeout 时,没法继续写入数据。在Mysq5中,不再支持autoReconnect=true。 当前hugegraph中,是否存在连接没有关闭的情况?
Environment ( 环境信息 - 必填 )
- Server Version: 0.11.2
- Backend: MySQL 8.0.16-X-Cluster-1.0.2
- OS: Linux version 4.18.0-147.5.1.el8_1.x86_64 ([email protected]) (gcc version 8.3.1 20190507 (Red Hat 8.3.1-4) (GCC))
- Data Size: xx vertices, xx edges
Expected behavior ( 期望表现 )
日常正常使用下,数据能成功呢添加。
Actual behavior ( 实际表现 / 报错)
数据库连接丢失。
异常日志
2021-05-20 00:41:14 93889862 [grizzly-http-server-11] [ERROR] com.baidu.hugegraph.server.RestServer [] - Failed to commit com.baidu.hugegraph.backend.BackendException: Failed to open transaction at com.baidu.hugegraph.backend.store.mysql.MysqlStore.beginTx(MysqlStore.java:292) ~[hugegraph-mysql-0.11.2.jar:?] at com.baidu.hugegraph.backend.tx.AbstractTransaction.commitMutation2Backend(AbstractTransaction.java:304) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.backend.cache.CachedGraphTransaction.commitMutation2Backend(CachedGraphTransaction.java:292) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.backend.tx.IndexableTransaction.commit2Backend(IndexableTransaction.java:57) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.backend.tx.AbstractTransaction.commit(AbstractTransaction.java:223) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.backend.tx.GraphTransaction.commit(GraphTransaction.java:493) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.StandardHugeGraph$Txs.commit(StandardHugeGraph.java:1237) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.StandardHugeGraph$TinkerPopTransaction.doCommit(StandardHugeGraph.java:1117) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at org.apache.tinkerpop.gremlin.structure.util.AbstractTransaction.commit(AbstractTransaction.java:104) ~[gremlin-core-3.4.3.jar:3.4.3] at com.baidu.hugegraph.StandardHugeGraph$TinkerPopTransaction.commit(StandardHugeGraph.java:1083) ~[hugegraph-core-0.11.2.jar:0.11.2.0] at com.baidu.hugegraph.api.API.commit(API.java:97) [hugegraph-api-0.11.2.jar:0.58.0.0] at com.baidu.hugegraph.api.graph.VertexAPI.create(VertexAPI.java:94) [hugegraph-api-0.11.2.jar:0.58.0.0] at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source) ~[?:?] at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_275] at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_275] at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:144) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:161) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:205) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:99) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:389) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:347) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:102) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:326) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.25.1.jar:?] at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.25.1.jar:?] at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [jersey-common-2.25.1.jar:?] at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [jersey-common-2.25.1.jar:?] at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [jersey-common-2.25.1.jar:?] at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [jersey-common-2.25.1.jar:?] at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [jersey-server-2.25.1.jar:?] at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:384) [jersey-container-grizzly2-http-2.25.1.jar:?] at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:200) [grizzly-http-server-2.4.4.jar:2.4.4] at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:569) [grizzly-framework-2.4.4.jar:2.4.4] at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:549) [grizzly-framework-2.4.4.jar:2.4.4] at java.lang.Thread.run(Thread.java:748) [?:1.8.0_275] Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: The last packet successfully received from the server was 8,873,663 milliseconds ago. The last packet sent successfully to the server was 8,873,666 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem. at com.mysql.cj.jdbc.exceptions.SQLError.createCommunicationsException(SQLError.java:174) ~[mysql-connector-java-8.0.19.jar:8.0.19] at com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:64) ~[mysql-connector-java-8.0.19.jar:8.0.19] at com.mysql.cj.jdbc.ConnectionImpl.setAutoCommit(ConnectionImpl.java:2056) ~[mysql-connector-java-8.0.19.jar:8.0.19] at com.baidu.hugegraph.backend.store.mysql.MysqlSessions$Session.begin(MysqlSessions.java:381) ~[hugegraph-mysql-0.11.2.jar:?] at com.baidu.hugegraph.backend.store.mysql.MysqlStore.beginTx(MysqlStore.java:290) ~[hugegraph-mysql-0.11.2.jar:?] ... 36 more
看到你使用的是 mysql8-cluster 的版本, 目前适配的应该只是 5.x, 后续的兼容性欢迎反馈.
不过你说的回话超时导致无法写入, 跟之前 issue 区有提到的自动超时没有重连是一个问题么 (参考 #561)? 是说长时间没有跟 mysql 发送请求后链接断开了还是其他意思. (store.connection_detect_interval 参数配置后可以解决么)
好的,我尝试添加store.connection_detect_interval。
Caused by: com.mysql.cj.jdbc.exceptions.CommunicationsException: The last packet successfully received from the server was 8,873,663 milliseconds ago. The last packet sent successfully to the server was 8,873,666 milliseconds ago. is longer than the server configured value of 'wait_timeout'. You should consider either expiring and/or testing connection validity before use in your application, increasing the server configured values for client timeouts, or using the Connector/J connection property 'autoReconnect=true' to avoid this problem.
具体报错如上,报错信息建议添加'autoReconnect=true' ,但MySQL5开始,不再支持该参数。
感谢反馈, 那你有没有去 MySQL 官方 issue 或者类似搜索过, autoRecoonect 参数被替换为什么, 或者是取消了类似的问题在高版本如何解决的呢?
我想官方取消参数一般会给出新的解决方案吧
- 配置
store.connection_detect_interval=60,不能解决。 - Mysql5之后的版本,
autoRecoonect参数是无效了。目前我所知的解决办法有2种:- 通过mysql的
wait_timeout增长会话时间(治标不治本 - 后台服务调用MYSQL的时候,会话需要在一段时间后自动关闭(时间要少于
wait_timeout)
- 通过mysql的
配置
store.connection_detect_interval=60,不能解决。Mysql5 之后的版本,
autoRecoonect参数是无效了。目前我所知的解决办法有 2 种:
- 通过 mysql 的
wait_timeout增长会话时间(治标不治本- 后台服务调用 MYSQL 的时候,会话需要在一段时间后自动关闭(时间要少于
wait_timeout)
多谢反馈, 那高版自动重连失效之后, 官方也没有给相关的解决方案么, 你后面提到的两个感觉都是用户侧需要做的改动, 还可能带来其他的影响.
mysql官网给出的理由是:不建议使用autoReconnect选项,因为没有安全的方法重新连接到MySQL服务器,而不冒损坏连接状态或数据库状态信息的风险。相反,使用连接池,这将使您的应用程序能够使用池中的可用连接连接到MySQL服务器。所以autoReconnect功能已弃用。
"当前hugegraph中,是否存在连接没有关闭的情况?"——关闭有close方法,就是不知道在哪里调用。可以看一下mysql模块的源码,十个类很简洁。
我对这个问题的看法:没有实现真正的连接池管理,只是在线程与connection之间做了个所谓的session绑定。所以当请求挤压时,线程和连接都会暴增,直到内存爆破。
个人建议:官方团队应当考虑引入第三方成熟的connection pool,而没必要重新发明轮子。
"当前 hugegraph 中,是否存在连接没有关闭的情况?"—— 关闭有 close 方法,就是不知道在哪里调用。可以看一下 mysql 模块的源码,十个类很简洁。
我对这个问题的看法:没有实现真正的连接池管理,只是在线程与 connection 之间做了个所谓的 session 绑定。所以当请求挤压时,线程和连接都会暴增,直到内存爆破。
个人建议:官方团队应当考虑引入第三方成熟的 connection pool,而没必要重新发明轮子。
你说引入类似 c3p0 那种连接框架么? 多谢建议, 后续优化 mysql 会调整, 也欢迎贡献 PR
这个问题,我这边尝试增加一个自定义的参数,用来对应自己mysql的超时时间 基本思想就是,当判断这个session超时,就关闭当前这个超时的session(我现在是简单的置空了,好像会照成内存泄漏,需要请教怎么正确的关闭),然后重新新增一个session,改成这样就没有发现报上面的那个错误。 public static final ConfigOption<Long> CONNECTION_WAIT_TIMEOUT = new ConfigOption<>( "store.connection_wait_timeout", "The interval in seconds for connects timeout, " + "if the idle time of a connection exceeds this value, " + "detect it and create new session before using, " + "value 0 means detecting every time.", rangeInt(0L, Long.MAX_VALUE), 28800L );
然后改写了BackendSessionPool的几个方法 public final BackendSession getOrNewSession() { BackendSession session = this.threadLocalSession.get(); if (session == null) { return createNewSession(); } else { session = createSessionIfWaitTimeout(session); this.detectSession(session); } return session; }
private BackendSession createNewSession(){ BackendSession session = this.newSession(); assert session != null; this.threadLocalSession.set(session); assert !this.sessions.containsKey(Thread.currentThread().getId()); this.sessions.put(Thread.currentThread().getId(), session); int sessionCount = this.sessionCount.incrementAndGet(); LOG.debug("Now(after connect({})) session count is: {}", this, sessionCount); return session; }
private BackendSession createSessionIfWaitTimeout(BackendSession session) { // create new session if the session idle time exceed wait timeout specified value long connectionWaitTimeout = TimeUnit.SECONDS.toMillis(this.connectionWaitTimeout); long now = System.currentTimeMillis(); if(now - session.updated() > connectionWaitTimeout){ session = null;//这个地方这样处理,好像会照成内存泄漏,所以想请教如何正确的关闭之前那个已经超时的连接。 return createNewSession(); } return session; }
maybe related to: https://github.com/apache/incubator-hugegraph/pull/1672