Archery icon indicating copy to clipboard operation
Archery copied to clipboard

[ bug ] 阿里云mongodb无法闯将索引

Open czxin788 opened this issue 3 years ago • 4 comments

重现步骤

提交两个sql报错 image

image

单独提交一个sql就没问题: image

版本信息

应用版本/分支:

部署方式:Docker、手工部署

czxin788 avatar Jul 21 '21 11:07 czxin788

你的版本是多少,实例是怎么配置的?django_q启动了吗?看一下后台日志报错?

fancy-lee avatar Aug 31 '21 02:08 fancy-lee

遇到同样的问题,应该是云数据库导致的,获取到的slave信息为云的ip与端口,不能直连

创建补丁 mongo.py.20211117.patch

diff --git a/sql/engines/mongo.py b/sql/engines/mongo.py
index 8e7143f..eef5e1a 100644
--- a/sql/engines/mongo.py
+++ b/sql/engines/mongo.py
@@ -295,9 +295,10 @@ class MongoEngine(EngineBase):
         slave_msg = self.exec_cmd(sql)
         if slave_msg.lower().find('undefined') < 0:
             sp_host = slave_msg.replace("\"", "").split(":")
-            self.host = sp_host[0]
-            self.port = int(sp_host[1])
-            return True
+            #self.host = sp_host[0]
+            #self.port = int(sp_host[1])
+            #return True
+            return False
         else:
             return False

应用补丁 patch -p1 < mongo.py.20211117.patch

weideguo avatar Nov 17 '21 07:11 weideguo

我现在升级到archery 1.9.0,还是报上面这个错误,搜了一下issue,原来以前旧版本就有这个问题,应该把你的代码合并到master分支, @weideguo image

czxin788 avatar Aug 31 '22 07:08 czxin788

经过确认,产生该问题的原因,正是weideguo所说,只有云mongodb才会发生。 发生该问题的场景是,同时写两行和两行以上createIndex才会发生,别的dml语句不会发生这个问题。 在执行createIndex语句时,mongo.py脚本要通过备节点去获取表行数,如果表行数大于500万,会提醒不能在高峰期创建索引。

获取备节点ip的sql语句是:

var host=""; rs.status().members.forEach(function(item) {i=1; if (item.stateStr =="SECONDARY") \
        {host=item.name } }); print(host);

对于自建mongodb,是可以正常获取到备节点ip的端口的,如下name字段:

rs.status().members
	{
		"_id" : 5,
		"name" : "192.168.1.33:27018",
		"health" : 1,
		"state" : 2,
		"stateStr" : "SECONDARY",
		"uptime" : 371036,
		"optime" : {
			"ts" : Timestamp(1662533428, 1),
			"t" : NumberLong(15)
		},
		"optimeDurable" : {
			"ts" : Timestamp(1662533428, 1),
			"t" : NumberLong(15)
		},
		"optimeDate" : ISODate("2022-09-07T06:50:28Z"),
		"optimeDurableDate" : ISODate("2022-09-07T06:50:28Z"),
		"lastHeartbeat" : ISODate("2022-09-07T06:50:37.258Z"),
		"lastHeartbeatRecv" : ISODate("2022-09-07T06:50:37.256Z"),
		"pingMs" : NumberLong(0),
		"lastHeartbeatMessage" : "",
		"syncingTo" : "192.168.1.33:27017",
		"syncSourceHost" : "192.168.1.33:27017",
		"syncSourceId" : 4,
		"infoMessage" : "",
		"configVersion" : 7
	},
	{
		"_id" : 6,
		"name" : "192.168.1.33:27019",
		"health" : 1,
		"state" : 2,
		"stateStr" : "SECONDARY",
		"uptime" : 370935,
		"optime" : {
			"ts" : Timestamp(1662533428, 1),
			"t" : NumberLong(15)
		},
		"optimeDurable" : {
			"ts" : Timestamp(1662533428, 1),
			"t" : NumberLong(15)
		},
		"optimeDate" : ISODate("2022-09-07T06:50:28Z"),
		"optimeDurableDate" : ISODate("2022-09-07T06:50:28Z"),
		"lastHeartbeat" : ISODate("2022-09-07T06:50:37.258Z"),
		"lastHeartbeatRecv" : ISODate("2022-09-07T06:50:37.254Z"),
		"pingMs" : NumberLong(0),
		"lastHeartbeatMessage" : "",
		"syncingTo" : "192.168.1.33:27017",
		"syncSourceHost" : "192.168.1.33:27017",
		"syncSourceId" : 4,
		"infoMessage" : "",
		"configVersion" : 7

可是,对于云mongodb,获取到的name字段,就不是真实的备节点ip地址了,如下:

 rs.status().members
	{
		"_id" : 1,
		"name" : "secondaryNode",
		"health" : 1,
		"state" : 2,
		"stateStr" : "SECONDARY",
		"uptime" : 55036037,
		"optime" : {
			"ts" : Timestamp(1662533378, 1),
			"t" : NumberLong(1)
		},
		"optimeDurable" : {
			"ts" : Timestamp(1662533378, 1),
			"t" : NumberLong(1)
		},
		"optimeDate" : ISODate("2022-09-07T06:49:38Z"),
		"optimeDurableDate" : ISODate("2022-09-07T06:49:38Z"),
		"lastHeartbeat" : ISODate("2022-09-07T06:49:42.818Z"),
		"lastHeartbeatRecv" : ISODate("2022-09-07T06:49:41.451Z"),
		"pingMs" : NumberLong(0),
		"lastHeartbeatMessage" : "",
		"syncingTo" : "",
		"syncSourceHost" : "",
		"syncSourceId" : -1,
		"infoMessage" : "",
		"configVersion" : 1
	},
	{
		"_id" : 2,
		"name" : "hiddenNode",
		"health" : 1,
		"state" : 2,
		"stateStr" : "SECONDARY",
		"uptime" : 55036037,
		"optime" : {
			"ts" : Timestamp(1662533378, 1),
			"t" : NumberLong(1)
		},
		"optimeDurable" : {
			"ts" : Timestamp(1662533378, 1),
			"t" : NumberLong(1)
		},
		"optimeDate" : ISODate("2022-09-07T06:49:38Z"),
		"optimeDurableDate" : ISODate("2022-09-07T06:49:38Z"),
		"lastHeartbeat" : ISODate("2022-09-07T06:49:41.670Z"),
		"lastHeartbeatRecv" : ISODate("2022-09-07T06:49:42.383Z"),
		"pingMs" : NumberLong(0),
		"lastHeartbeatMessage" : "",
		"syncingTo" : "",
		"syncSourceHost" : "",

看到云mongodb获取到的name值是secondaryNode和hiddenNode,这样archery就办法连上备节点执行sql语句了。

为了解决云mongodb不能创建索引的问题,就按照 @weideguo 的方法,把sql/engines/mongo.py的get_salve函数:

 def get_slave(self):
        """获得从节点的port和host"""

        sql = """var host=""; rs.status().members.forEach(function(item) {i=1; if (item.stateStr =="SECONDARY") \
        {host=item.name } }); print(host);"""
        slave_msg = self.exec_cmd(sql)
        if slave_msg.lower().find("undefined") < 0:
            sp_host = slave_msg.replace('"', "").split(":")
            self.host = sp_host[0]
            self.port = int(sp_host[1])
            return True
        else:
            return False

改成

     def get_slave(self):
        """获得从节点的port和host"""

        sql = """var host=""; rs.status().members.forEach(function(item) {i=1; if (item.stateStr =="SECONDARY") \
        {host=item.name } }); print(host);"""
        slave_msg = self.exec_cmd(sql)
        # 如果是阿里云的云mongodb,就不获取备节点ip和端口了,因为获取不到真实的备节点ip,那就干脆直接用主节点来执行sql
        if slave_msg.strip() == 'SECONDARY' or slave_msg.strip() == 'hiddenNode':
            return False
        if slave_msg.lower().find("undefined") < 0:
            sp_host = slave_msg.replace('"', "").split(":")
            self.host = sp_host[0]
            logger.warning(self.host + '  5555')
            self.port = int(sp_host[1])
            return True
        else:
            return False

czxin788 avatar Sep 07 '22 07:09 czxin788

但是华为云获取的member信息是ip:port方式,这个ip在vpc网络是访问不到的,修复方法使用":"判断,华为云就不适用了 image

Aaron199 avatar Apr 13 '23 10:04 Aaron199