tx-lcn
NullPointerException when integrating 5.0.1
NullPointerException in 5.0.1
java.lang.NullPointerException: null
at com.codingapi.txlcn.tc.core.checking.DefaultDTXExceptionHandler.handleNotifyGroupBusinessException(DefaultDTXExceptionHandler.java:97)
at com.codingapi.txlcn.tc.core.template.TransactionControlTemplate.notifyGroup(TransactionControlTemplate.java:159)
at com.codingapi.txlcn.tc.core.transaction.lcn.control.LcnStartingTransaction.postBusinessCode(LcnStartingTransaction.java:65)
at com.codingapi.txlcn.tc.core.DTXServiceExecutor.transactionRunning(DTXServiceExecutor.java:109)
at com.codingapi.txlcn.tc.aspect.weave.DTXLogicWeaver.runTransaction(DTXLogicWeaver.java:95)
at com.codingapi.txlcn.tc.aspect.TransactionAspect.runWithLcnTransaction(TransactionAspect.java:93)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethodWithGivenArgs(AbstractAspectJAdvice.java:629)
at org.springframework.aop.aspectj.AbstractAspectJAdvice.invokeAdviceMethod(AbstractAspectJAdvice.java:618)
at org.springframework.aop.aspectj.AspectJAroundAdvice.invoke(AspectJAroundAdvice.java:70)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:168)
at org.springframework.aop.interceptor.ExposeInvocationInterceptor.invoke(ExposeInvocationInterceptor.java:92)
This exception appears intermittently (it is not always reproducible). Debugging shows that it happens after a database deadlock:
- org.hibernate.exception.LockTimeoutException: could not execute statement
- java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
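Could the maintainers please take a look and help track down the cause? If you need any further information, reply to this issue.
TM component incompatibility
While trying different TM versions I found that a TC cannot register with a TM of a different version. For example, a 5.0.1 TC cannot register with a 5.0.2 TM.
Is there a way to upgrade the TC/TM components smoothly?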
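The same problem occurs in 5.0.2.
I hit this too, also on 5.0.2.
I hit this too. Did the original poster solve it?
Did the original poster solve this? I have run into the same problem.
Did the original poster solve this?
I hit this on 5.0.2 as well. Did the original poster solve it?
Testing suggests that the NullPointerException appears once a debugging session runs longer than about 4-5 s.
A single reply for everyone: for the NullPointerException, check how your aspects (the resource aspect and annotation aspect configuration) relate to the transaction aspect; working through that should resolve it.
Problem 2, the upgrade issue: there is no solution for now.
Yes. While debugging, increasing the total distributed transaction execution time (ms, default 36000) appropriately avoids the NullPointerException, e.g. tx-lcn.manager.dtx-time=24000. In production, measure to arrive at a suitable value.
I ran into this today as well. I am on version 5.0.2, with these distributed transaction annotations: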
@LcnTransaction(propagation = DTXPropagation.REQUIRED)
@LcnTransaction(propagation = DTXPropagation.SUPPORTS)
Tracing the source showed that in my case the NullPointerException is caused by a distributed transaction timeout. Analysis and fix:
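tx-lcn:
  manager:
    # Distributed transaction timeout (ms). It must exceed
    # [(microservice call-chain length e) * (hystrix timeout) + N (time for multiple cross-service calls)].
    # Otherwise the timeout leaves the Throwable ex parameter of
    # com.codingapi.txlcn.tc.core.checking.DefaultDTXExceptionHandler.handleNotifyGroupBusinessException null,
    # which throws a NullPointerException, so the "end transaction" step never runs. The transaction never ends
    # and later requests stay stuck there, even if the receiving service is restarted.
    # Steps to reproduce:
    # 1. In receiving service B, set ribbon.ReadTimeout to 10 seconds and make the receiving endpoint call Thread.sleep(9 * 1000) (9 seconds)
    # 2. Start requesting service A and receiving service B
    # 3. Make sure both services have started, have registered with Eureka, and that the Spring Gateway forwards requests correctly
    # 4. Service A sends a request, calling service B's endpoint through FeignClient
    # 5. The endpoint now waits longer than 8 seconds (the default distributed transaction timeout is 8 seconds), so the distributed transaction times out
    # 6. Trace the source that handles the distributed transaction timeout:
    #    TransactionControlTemplate.notifyGroup {
    #        if (globalContext.isDTXTimeout()) {
    #            throw new LcnBusinessException("dtx timeout.");
    #        }
    #        ...
    #        catch (LcnBusinessException e) {
    #            // e.getCause() is null here, which makes dtxExceptionHandler.handleNotifyGroupBusinessException throw a NullPointerException
    #            dtxExceptionHandler.handleNotifyGroupBusinessException(Arrays.asList(groupId, state, unitId, transactionType), e.getCause());
    #        }
    #    }
    #
    #    DefaultDTXExceptionHandler.handleNotifyGroupBusinessException {
    #        ...
    #        if ((ex.getCause() != null && ex.getCause() instanceof UserRollbackException)) // this line throws the NullPointerException
    #        ...
    #        transactionCleanTemplate.clean(groupId, unitId, transactionType, state); // so this line never runs and the transaction cannot end normally
    #    }
    # 7. The distributed transaction timeout therefore prevents the transaction from ending normally. Every later request gets stuck here
    #    and keeps reporting this exception. Restarting service B does not help; service A has to be restarted.
    #
    # Because the required distributed transaction timeout (ms) cannot be determined directly, this can be improved further:
    # 1. Set the distributed transaction timeout (ms) to at least [1 * (hystrix timeout)]
    # 2. Override DefaultDTXExceptionHandler.handleNotifyGroupBusinessException to null-check ex, so the transaction ends normally,
    #    then throw a distributed transaction timeout exception so the timeout is visible
    # 3. Make sure the overriding DefaultDTXExceptionHandler is loaded ahead of the one in txlcn-tc-5.0.2.RELEASE.jar
    dtx-time: 35000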
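The failure mechanism traced in step 5 of the original comments can be shown in isolation. The sketch below does not use tx-lcn at all: `NullCauseDemo`, `BusinessException`, and both helper methods are hypothetical stand-ins, and the nested `UserRollbackException` only borrows the library class's name. It relies on one fact about the JDK: an exception constructed from a message alone has a null cause, so passing `e.getCause()` on to a method that dereferences it throws a NullPointerException.

```java
public class NullCauseDemo {

    /** Hypothetical stand-in for an exception built from a message only, like "dtx timeout.". */
    public static class BusinessException extends Exception {
        public BusinessException(String message) {
            super(message);
        }
    }

    /** Hypothetical stand-in that only shares its name with the library's exception. */
    public static class UserRollbackException extends RuntimeException {
    }

    /** Same shape as the unguarded check quoted above: dereferences ex without a null test. */
    public static boolean unguardedIsUserRollback(Throwable ex) {
        return ex.getCause() != null && ex.getCause() instanceof UserRollbackException;
    }

    /** Null-safe variant: a missing exception is simply "not a user rollback". */
    public static boolean guardedIsUserRollback(Throwable ex) {
        if (ex == null) {
            return false;
        }
        return ex instanceof UserRollbackException
                || ex.getCause() instanceof UserRollbackException;
    }

    public static void main(String[] args) {
        // No cause was supplied to the constructor, so getCause() returns null.
        Throwable cause = new BusinessException("dtx timeout.").getCause();
        System.out.println("cause = " + cause);

        try {
            unguardedIsUserRollback(cause);
        } catch (NullPointerException e) {
            System.out.println("unguarded check threw NullPointerException");
        }

        System.out.println("guarded check returned " + guardedIsUserRollback(cause));
    }
}
```

The guarded variant is the same idea as the `ex == null` branch in the override posted in this thread: the null case is handled before anything is called on `ex`.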
The overriding code is as follows:
/*
* Copyright 2017-2019 CodingApi .
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package com.codingapi.txlcn.tc.core.checking;

import com.codingapi.txlcn.common.exception.*;
import com.codingapi.txlcn.logger.TxLogger;
import com.codingapi.txlcn.tc.txmsg.TMReporter;
import com.codingapi.txlcn.tc.core.template.TransactionCleanTemplate;
import com.codingapi.txlcn.txmsg.params.TxExceptionParams;
import lombok.extern.slf4j.Slf4j;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Component;

import java.util.List;

/**
 * Description:
 * Overrides handleNotifyGroupBusinessException to null-check the Throwable ex parameter. This fixes the
 * NullPointerException on ex.getCause() caused by a distributed transaction timeout, which kept the
 * transaction from ending normally and left later requests stuck, repeatedly throwing NullPointerException.
 * Date: 2019/09/25
 *
 * @author shenjh
 *
 * Description: distributed transaction exception handler
 * Date: 2018/12/20
 *
 * @author ujued
 * @see DTXExceptionHandler
 */
@Component
@Slf4j
public class DefaultDTXExceptionHandler implements DTXExceptionHandler {

    private static final TxLogger txLogger = TxLogger.newLogger(DefaultDTXExceptionHandler.class);

    private final TransactionCleanTemplate transactionCleanTemplate;

    private final TMReporter tmReporter;

    @Autowired
    public DefaultDTXExceptionHandler(TransactionCleanTemplate transactionCleanTemplate, TMReporter tmReporter) {
        this.transactionCleanTemplate = transactionCleanTemplate;
        this.tmReporter = tmReporter;
    }

    @Override
    public void handleCreateGroupBusinessException(Object params, Throwable ex) throws TransactionException {
        throw new TransactionException(ex);
    }

    @Override
    public void handleCreateGroupMessageException(Object params, Throwable ex) throws TransactionException {
        throw new TransactionException(ex);
    }

    @Override
    public void handleJoinGroupBusinessException(Object params, Throwable ex) throws TransactionException {
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        String unitId = (String) paramList.get(1);
        String unitType = (String) paramList.get(2);
        try {
            transactionCleanTemplate.clean(groupId, unitId, unitType, 0);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "join group", "clean [{}]transaction fail.", unitType);
        }
        throw new TransactionException(ex);
    }

    @Override
    public void handleJoinGroupMessageException(Object params, Throwable ex) throws TransactionException {
        throw new TransactionException(ex);
    }

    @Override
    public void handleNotifyGroupBusinessException(Object params, Throwable ex) {
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        int state = (int) paramList.get(1);
        String unitId = (String) paramList.get(2);
        String transactionType = (String) paramList.get(3);

        if (ex == null) { // added by shenjh 20190925: null check
            // log.error("Distributed transaction timed out! Note by shenjh");
            /*
            See: LcnConnectionProxy.RpcResponseState notify(int state)
            if (state == 1) {
                log.debug("commit transaction type[lcn] proxy connection:{}.", this);
                connection.commit();
            } else {
                log.debug("rollback transaction type[lcn] proxy connection:{}.", this);
                connection.rollback();
            }
            */
            state = 0; // roll back the distributed transaction
        } else {
            // Forced rollback by the user.
            if (ex instanceof UserRollbackException) {
                state = 0;
            }
            if ((ex.getCause() != null && ex.getCause() instanceof UserRollbackException)) {
                state = 0;
            }
        }

        // End the transaction
        try {
            transactionCleanTemplate.clean(groupId, unitId, transactionType, state);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "notify group", "{} > clean transaction error.", transactionType);
        }

        if (ex == null) { // added by shenjh 20190925: throw a distributed transaction timeout exception
            throw new TxlcnTimeoutException("Distributed transaction timed out! Note by shenjh");
        }
    }

    @Override
    public void handleNotifyGroupMessageException(Object params, Throwable ex) {
        // When state is 0
        List paramList = (List) params;
        String groupId = (String) paramList.get(0);
        int state = (int) paramList.get(1);
        if (state == 0) {
            handleNotifyGroupBusinessException(params, ex);
            return;
        }

        // End the transaction normally according to state (the aspect compensation record is kept).
        // The TxManager can fail in two ways: on the request or on the response. On a request failure the
        // business here needs compensation; on a response failure the transaction must be cleaned up by state.
        // On a request failure:
        //   participants commit the transaction based on the reported compensation record.
        // On a response failure:
        //   participants commit the transaction normally, and the local business is notified about the transaction.
        // In both cases the compensation information can be ignored and the local compensation record deleted directly.
        String unitId = (String) paramList.get(2);
        String transactionType = (String) paramList.get(3);
        try {
            transactionCleanTemplate.cleanWithoutAspectLog(groupId, unitId, transactionType, state);
        } catch (TransactionClearException e) {
            txLogger.error(groupId, unitId, "notify group", "{} > cleanWithoutAspectLog transaction error.", transactionType);
        }

        // Report to the Manager, retrying until it succeeds.
        tmReporter.reportTransactionState(groupId, null, TxExceptionParams.NOTIFY_GROUP_ERROR, state);
    }
}
package com.codingapi.txlcn.common.exception;

import java.io.Serializable;

/**
 * Distributed transaction timeout exception
 * @author shenjh
 * @version 1.0
 * @since 2019-09-25 16:45
 */
public class TxlcnTimeoutException extends RuntimeException implements Serializable {

    private static final long serialVersionUID = 1L;

    public TxlcnTimeoutException(String message) {
        super(message);
    }

    public TxlcnTimeoutException(Throwable ex) {
        super(ex);
    }

    public TxlcnTimeoutException() {
    }
}
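The point of the override is its ordering: decide the state, always run the cleanup, and only then surface the timeout. The following is a minimal sketch of that ordering with no tx-lcn dependency. `CleanThenThrowDemo`, `RecordingCleaner`, and `DemoTimeoutException` are hypothetical names invented for illustration; the recorder stands in for the cleanup step so the behaviour can be observed without a database.

```java
import java.util.ArrayList;
import java.util.List;

public class CleanThenThrowDemo {

    /** Hypothetical stand-in for the cleanup step; it records the state it was called with. */
    public static class RecordingCleaner {
        public final List<Integer> states = new ArrayList<>();

        public void clean(String groupId, int state) {
            states.add(state);
        }
    }

    /** Hypothetical stand-in for the timeout exception defined above. */
    public static class DemoTimeoutException extends RuntimeException {
        public DemoTimeoutException(String message) {
            super(message);
        }
    }

    /** Same ordering as the override: pick the state, clean unconditionally, then report the timeout. */
    public static void handle(RecordingCleaner cleaner, String groupId, int state, Throwable ex) {
        if (ex == null) {
            state = 0; // no exception was passed in: treat it as a timeout and roll back
        }
        cleaner.clean(groupId, state);
        if (ex == null) {
            throw new DemoTimeoutException("distributed transaction timed out: " + groupId);
        }
    }

    public static void main(String[] args) {
        RecordingCleaner cleaner = new RecordingCleaner();
        try {
            handle(cleaner, "group-1", 1, null);
        } catch (DemoTimeoutException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("clean was called with states " + cleaner.states);
    }
}
```

Because the cleanup call sits before the throw, the caller still sees a timeout error, but the transaction is no longer left open.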
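With 5.0.2, viewing the exception records in the TM inserts two identical rows into the database, and the official demo also hits the NullPointerException described above.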
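Total distributed transaction execution time (ms). Default: 36000
tx-lcn.manager.dtx-time=36000. After increasing this time on the TM, the NullPointerException went away.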
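For picking a value, the rule of thumb quoted earlier in this thread (call-chain length times the hystrix timeout, plus an allowance for multiple cross-service calls) can be turned into a small calculation. This is only a sketch of that arithmetic: `DtxTimeSizing`, `minDtxTimeMs`, and the numbers in `main` are hypothetical, and a measured value from production should take precedence.

```java
public class DtxTimeSizing {

    /**
     * Lower bound for dtx-time in milliseconds:
     * chainLength * hystrixTimeoutMs + extraCallsMs.
     * The configured value should be strictly greater than this.
     */
    public static long minDtxTimeMs(int chainLength, long hystrixTimeoutMs, long extraCallsMs) {
        if (chainLength < 1 || hystrixTimeoutMs < 0 || extraCallsMs < 0) {
            throw new IllegalArgumentException("chainLength must be >= 1 and timeouts must be >= 0");
        }
        // The exact variants throw ArithmeticException instead of silently overflowing.
        return Math.addExact(Math.multiplyExact((long) chainLength, hystrixTimeoutMs), extraCallsMs);
    }

    public static void main(String[] args) {
        // Hypothetical example: 3 services in the chain, 10 s hystrix timeout, 2 s allowance.
        long bound = minDtxTimeMs(3, 10_000L, 2_000L);
        System.out.println("dtx-time should exceed " + bound + " ms");
    }
}
```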