huaweicloud-iot-device-sdk-java

Device connection error causes endless reconnection

Open xiaobingzhou opened this issue 4 years ago • 2 comments

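Affected version: 1.0.0

  • Steps to reproduce

Run the PropertySample class in iot-device-demo. The log shows the error "connect failed 错误的用户名或密码 (4)" (bad user name or password, return code 4), after which the SDK reconnects indefinitely.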

2020-11-14 13:45:13  INFO AbstractDevice:54 - create device: 5e06bfee334dd4f33759f5b3_demo
2020-11-14 13:45:15  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:17  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
2020-11-14 13:45:18  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:19  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
2020-11-14 13:45:20  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:21  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
2020-11-14 13:45:22  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:23  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
2020-11-14 13:45:24  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:25  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
2020-11-14 13:45:26  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:27  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
2020-11-14 13:45:28  INFO MqttConnection:153 - try to connect to ssl://iot-mqtts.cn-north-4.myhuaweicloud.com:8883
2020-11-14 13:45:29  INFO MqttConnection:170 - connect failed 错误的用户名或密码 (4)
  • Analysis

Version 0.8.0 does not have this bug. Compare the com.huaweicloud.sdk.iot.device.client.DeviceClient#connect method in the two versions.

Version 0.8.0 has no reconnection logic yet:

   # 0.8.0  com.huaweicloud.sdk.iot.device.client.DeviceClient#connect
   /**
     * Connects to the platform. This call blocks, with a timeout of 20 s.
     *
     * @return 0 if the connection succeeds; any other value indicates failure
     */
    protected int connect() {
        int ret = connection.connect();
        if (ret != 0) {
            return ret;
        }

        connection.subscribeTopic("$oc/devices/" + clientConf.getDeviceId() + "/sys/messages/down", null);
        connection.subscribeTopic("$oc/devices/" + clientConf.getDeviceId() + "/sys/commands/#", null);
        connection.subscribeTopic("$oc/devices/" + clientConf.getDeviceId() + "/sys/properties/set/#", null);
        connection.subscribeTopic("$oc/devices/" + clientConf.getDeviceId() + "/sys/properties/get/#", null);
        connection.subscribeTopic("$oc/devices/" + clientConf.getDeviceId() + "/sys/shadow/get/response/#", null);
        connection.subscribeTopic("$oc/devices/" + clientConf.getDeviceId() + "/sys/events/down", null);

        return ret;
    }

Version 1.0.0 added reconnection logic:

   # 1.0.0  com.huaweicloud.sdk.iot.device.client.DeviceClient#connect
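   /**
     * Connects to the platform. This call blocks, with a timeout of 60 s. On success, the SDK automatically subscribes to the system-defined topics on the platform.
     *
     * @return 0 if the connection succeeds; any other value indicates failure
     */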
    public int connect() {

        synchronized (this) {
            if (executorService == null) {
                executorService = Executors.newScheduledThreadPool(ClientThreadCount);
            }
        }

        int ret = connection.connect();

        // Reconnect with backoff
        while (ret != 0) {
            connectFailedTime++;
            try {
                if (connectFailedTime < 10) {
                    Thread.sleep(1000);
                } else if (connectFailedTime < 50) {
                    Thread.sleep(5000);
                } else {
                    Thread.sleep(120000);
                }
                this.connection = new MqttConnection(clientConf, this);
                ret = connection.connect();
            } catch (InterruptedException e) {
                log.debug("connect failed" + connectFailedTime + "times");
            }
        }

        connectFailedTime = 0;

        return ret;
    }

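Inside the while loop, MqttConnection.connect() catches the exception internally and never rethrows it, so the caller only ever sees a non-zero return value. With a failure that cannot recover, such as wrong credentials, the while loop never exits.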

# com.huaweicloud.sdk.iot.device.transport.mqtt.MqttConnection#connect
public int connect() {

        try {
            // body omitted
        } catch (MqttException e) {
            log.error(ExceptionUtil.getBriefStackTrace(e));

        }

        return mqttAsyncClient.isConnected() ? 0 : -1;
    }
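
One possible way to avoid the endless loop is to cap the number of attempts and stop immediately on failures that retrying cannot fix. The sketch below is not SDK code: BoundedReconnect and its parameters are hypothetical names, and it assumes the connect attempt could report the MQTT CONNACK return code (4 = bad user name or password, 5 = not authorized in MQTT 3.1.1) instead of the flat -1 that MqttConnection#connect returns today.

```java
import java.util.function.IntSupplier;

/** Hypothetical helper, not part of the SDK: bounded retry with capped exponential backoff. */
final class BoundedReconnect {
    // MQTT 3.1.1 CONNACK return codes that a retry cannot fix.
    static final int BAD_USERNAME_OR_PASSWORD = 4;
    static final int NOT_AUTHORIZED = 5;

    private BoundedReconnect() {}

    /**
     * @param attempt     one connect attempt; returns 0 on success, otherwise a failure code
     *                    (assumed here to carry the CONNACK return code when one is available)
     * @param maxAttempts upper bound on the number of attempts
     * @param baseDelayMs delay before the second attempt; doubled after each failure
     * @param maxDelayMs  cap on the delay between attempts
     * @return 0 on success, otherwise the code of the last failed attempt
     */
    static int connect(IntSupplier attempt, int maxAttempts, long baseDelayMs, long maxDelayMs) {
        int ret = -1;
        long delay = baseDelayMs;
        for (int i = 1; i <= maxAttempts; i++) {
            ret = attempt.getAsInt();
            // Stop on success, on an unrecoverable failure, or when attempts are exhausted.
            if (ret == 0 || isFatal(ret) || i == maxAttempts) {
                return ret;
            }
            try {
                Thread.sleep(delay);
            } catch (InterruptedException e) {
                // Restore the interrupt flag and give up instead of swallowing the interrupt.
                Thread.currentThread().interrupt();
                return ret;
            }
            delay = Math.min(delay * 2, maxDelayMs);
        }
        return ret;
    }

    static boolean isFatal(int code) {
        return code == BAD_USERNAME_OR_PASSWORD || code == NOT_AUTHORIZED;
    }
}
```

With this shape, a call such as BoundedReconnect.connect(() -> 4, 10, 1000, 120000) returns 4 after a single attempt, so the caller can report the credential problem instead of retrying forever.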

xiaobingzhou Nov 14 '20 06:11

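Hello, and thanks for the feedback. Version 1.0.0 added a backoff reconnection mechanism. We will optimize this part in a later release; once the change is done, please help check whether it solves your problem.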

louiscrazy Nov 18 '20 09:11

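OK, thanks for the reply. Could you also take a look at the following issue?

  • Problem: manually calling the close method com.huaweicloud.sdk.iot.device.client.DeviceClient#close only closes the MQTT connection. The single-thread pool com.huaweicloud.sdk.iot.device.client.DeviceClient#executorService is not shut down (a pool is created in com.huaweicloud.sdk.iot.device.client.DeviceClient#connect every time a device connects successfully), so there is a risk of a memory leak.

  • Suggestion: consider providing a setter, com.huaweicloud.sdk.iot.device.client.DeviceClient#setExecutorService, so that the caller can supply a thread pool manually.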
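
A minimal sketch of what I have in mind, using a hypothetical PooledClient class as a stand-in (it is not the SDK's DeviceClient, and the method bodies are assumptions for illustration only). The client creates its own pool only when none was injected, and close() shuts down only a pool the client created itself:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;

/** Hypothetical stand-in for a device client; not the SDK's DeviceClient. */
final class PooledClient implements AutoCloseable {
    private ScheduledExecutorService executorService;
    private boolean ownsExecutor;

    /** Lets the caller supply a shared pool; the caller stays responsible for shutting it down. */
    synchronized void setExecutorService(ScheduledExecutorService shared) {
        // Release a pool this client created earlier before replacing it.
        if (executorService != null && ownsExecutor) {
            executorService.shutdown();
        }
        executorService = shared;
        ownsExecutor = false;
    }

    /** Creates a private pool only when none was injected; returns the pool in use. */
    synchronized ScheduledExecutorService connect() {
        if (executorService == null) {
            executorService = Executors.newScheduledThreadPool(1);
            ownsExecutor = true;
        }
        return executorService;
    }

    /** Shuts down the pool only if this client created it, then forgets the reference. */
    @Override
    public synchronized void close() {
        if (executorService != null && ownsExecutor) {
            executorService.shutdown();
        }
        executorService = null;
        ownsExecutor = false;
    }
}
```

This way many devices could share one pool supplied by the application, while a client that was never given a pool still cleans up after itself on close().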

xiaobingzhou Nov 18 '20 10:11