Archery
Archery copied to clipboard
[ bug ] 阿里云mongodb无法闯将索引
重现步骤
提交两个sql报错
单独提交一个sql就没问题:
版本信息
应用版本/分支:
部署方式:Docker、手工部署
你的版本是多少,实例是怎么配置的?django_q启动了吗?看一下后台日志报错?
遇到同样的问题,应该是云数据库导致的,获取到的slave信息为云的ip与端口,不能直连
创建补丁 mongo.py.20211117.patch
diff --git a/sql/engines/mongo.py b/sql/engines/mongo.py
index 8e7143f..eef5e1a 100644
--- a/sql/engines/mongo.py
+++ b/sql/engines/mongo.py
@@ -295,9 +295,10 @@ class MongoEngine(EngineBase):
slave_msg = self.exec_cmd(sql)
if slave_msg.lower().find('undefined') < 0:
sp_host = slave_msg.replace("\"", "").split(":")
- self.host = sp_host[0]
- self.port = int(sp_host[1])
- return True
+ #self.host = sp_host[0]
+ #self.port = int(sp_host[1])
+ #return True
+ return False
else:
return False
应用补丁 patch -p1 < mongo.py.20211117.patch
我现在升级到archery 1.9.0,还是报上面这个错误,搜了一下issue,原来以前旧版本就有这个问题,应该把你的代码合并到master分支, @weideguo
经过确认,产生该问题的原因,正是weideguo所说,只有云mongodb才会发生。 发生该问题的场景是,同时写两行和两行以上createIndex才会发生,别的dml语句不会发生这个问题。 在执行createIndex语句时,mongo.py脚本要通过备节点去获取表行数,如果表行数大于500万,会提醒不能在高峰期创建索引。
获取备节点ip的sql语句是:
var host=""; rs.status().members.forEach(function(item) {i=1; if (item.stateStr =="SECONDARY") \
{host=item.name } }); print(host);
对于自建mongodb,是可以正常获取到备节点ip的端口的,如下name字段:
rs.status().members
{
"_id" : 5,
"name" : "192.168.1.33:27018",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 371036,
"optime" : {
"ts" : Timestamp(1662533428, 1),
"t" : NumberLong(15)
},
"optimeDurable" : {
"ts" : Timestamp(1662533428, 1),
"t" : NumberLong(15)
},
"optimeDate" : ISODate("2022-09-07T06:50:28Z"),
"optimeDurableDate" : ISODate("2022-09-07T06:50:28Z"),
"lastHeartbeat" : ISODate("2022-09-07T06:50:37.258Z"),
"lastHeartbeatRecv" : ISODate("2022-09-07T06:50:37.256Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "192.168.1.33:27017",
"syncSourceHost" : "192.168.1.33:27017",
"syncSourceId" : 4,
"infoMessage" : "",
"configVersion" : 7
},
{
"_id" : 6,
"name" : "192.168.1.33:27019",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 370935,
"optime" : {
"ts" : Timestamp(1662533428, 1),
"t" : NumberLong(15)
},
"optimeDurable" : {
"ts" : Timestamp(1662533428, 1),
"t" : NumberLong(15)
},
"optimeDate" : ISODate("2022-09-07T06:50:28Z"),
"optimeDurableDate" : ISODate("2022-09-07T06:50:28Z"),
"lastHeartbeat" : ISODate("2022-09-07T06:50:37.258Z"),
"lastHeartbeatRecv" : ISODate("2022-09-07T06:50:37.254Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "192.168.1.33:27017",
"syncSourceHost" : "192.168.1.33:27017",
"syncSourceId" : 4,
"infoMessage" : "",
"configVersion" : 7
可是,对于云mongodb,获取到的name字段,就不是真实的备节点ip地址了,如下:
rs.status().members
{
"_id" : 1,
"name" : "secondaryNode",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 55036037,
"optime" : {
"ts" : Timestamp(1662533378, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1662533378, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2022-09-07T06:49:38Z"),
"optimeDurableDate" : ISODate("2022-09-07T06:49:38Z"),
"lastHeartbeat" : ISODate("2022-09-07T06:49:42.818Z"),
"lastHeartbeatRecv" : ISODate("2022-09-07T06:49:41.451Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "",
"syncSourceHost" : "",
"syncSourceId" : -1,
"infoMessage" : "",
"configVersion" : 1
},
{
"_id" : 2,
"name" : "hiddenNode",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 55036037,
"optime" : {
"ts" : Timestamp(1662533378, 1),
"t" : NumberLong(1)
},
"optimeDurable" : {
"ts" : Timestamp(1662533378, 1),
"t" : NumberLong(1)
},
"optimeDate" : ISODate("2022-09-07T06:49:38Z"),
"optimeDurableDate" : ISODate("2022-09-07T06:49:38Z"),
"lastHeartbeat" : ISODate("2022-09-07T06:49:41.670Z"),
"lastHeartbeatRecv" : ISODate("2022-09-07T06:49:42.383Z"),
"pingMs" : NumberLong(0),
"lastHeartbeatMessage" : "",
"syncingTo" : "",
"syncSourceHost" : "",
看到云mongodb获取到的name值是secondaryNode和hiddenNode,这样archery就办法连上备节点执行sql语句了。
为了解决云mongodb不能创建索引的问题,就按照 @weideguo 的方法,把sql/engines/mongo.py的get_salve函数:
def get_slave(self):
"""获得从节点的port和host"""
sql = """var host=""; rs.status().members.forEach(function(item) {i=1; if (item.stateStr =="SECONDARY") \
{host=item.name } }); print(host);"""
slave_msg = self.exec_cmd(sql)
if slave_msg.lower().find("undefined") < 0:
sp_host = slave_msg.replace('"', "").split(":")
self.host = sp_host[0]
self.port = int(sp_host[1])
return True
else:
return False
改成
def get_slave(self):
"""获得从节点的port和host"""
sql = """var host=""; rs.status().members.forEach(function(item) {i=1; if (item.stateStr =="SECONDARY") \
{host=item.name } }); print(host);"""
slave_msg = self.exec_cmd(sql)
# 如果是阿里云的云mongodb,就不获取备节点ip和端口了,因为获取不到真实的备节点ip,那就干脆直接用主节点来执行sql
if slave_msg.strip() == 'SECONDARY' or slave_msg.strip() == 'hiddenNode':
return False
if slave_msg.lower().find("undefined") < 0:
sp_host = slave_msg.replace('"', "").split(":")
self.host = sp_host[0]
logger.warning(self.host + ' 5555')
self.port = int(sp_host[1])
return True
else:
return False
但是华为云获取的member信息是ip:port方式,这个ip在vpc网络是访问不到的,修复方法使用":"判断,华为云就不适用了